Page 1 of 3 123 LastLast
Results 1 to 10 of 23

Thread: Factoring 1

  1. #1
    thinBasic MVPs danbaron's Avatar
    Join Date
    Jan 2010
    Location
    California
    Posts
    1,378
    Rep Power
    152

    Factoring 1

    I'm posting this in the science sub-forum. That way, I may be less vulnerable to attacks for writing the code of this topic in Racket.

    This topic is about factoring composite integers. I became interested in it from Johannes' work on big integers, factoring composites, and primes.

    Does this topic qualify as science? I think it does. Factoring is part of mathematics, and mathematics is the basis of science. And certainly, factoring done by computer, is part of computer science - and, computer science is part of science.

    The code and output below use Racket, which is a descendant of Scheme, which is a dialect of Lisp.

    I'm calling this topic, "Factoring 1", because, the algorithm I used is the most basic factoring algorithm.

    Every positive integer, n, can be factored into k primes, for some value of k (k >= 1), such that the product of the k primes, equals n.

    The algorithm works by dividing the integer (n) to factor, by members of a list containing the first y primes. If the y-th prime is at least as big as sqrt(n), then, the list is guaranteed to be able to factor n.

    I arbitrarily decided to make a list of all of the primes less than or equal to the first prime greater than 10,000,000. That list is saved to file, and can be used again and again. Being able to read and write the list to file quickly, depended upon the ability of Racket to "serialize" and "deserialize" lists. The list is read and written as one unit, and the actions are very fast.

    (When you use a computer to make a list of the first y primes, for some value of y, you soon realize that each succeeding prime takes longer to calculate than the previous one. The rate of growth of the list is always decreasing. At first, the list seems to grow very fast. But soon you notice that it is growing more slowly. Actually, as y approaches infinity, so does the time to calculate the y-th prime. But, theoretically, it is always possible to find the y+1-th prime, if you have a list of the first y primes. So, every prime that has ever been found, or ever will be found, could be found by constructing a list that begins with 2, and then repeatedly calculates the next prime and appends it to the list.)

    So, the first thing I had to do was to make the list of primes. There is a procedure below, "set-initial-primes-file". It serialized and saved to a file called, "prime-list.txt", just the first four primes, 2, 3, 5, 7. I needed that little file, so I had something I could append to.

    After that, the procedure, "append-primes-file", was used to increase the size of the list (file). It takes a parameter, "total-time", which determines how long the procedure runs. As written, the units of, "total-time", are hours. I ran the procedure numerous times, and each time it ran, the list (file) grew, until it reached its finished size. I think it took approximately 16 total hours to complete the file.

    It turns out that the first prime greater than 10,000,000, is, 10,000,019. The complete list contains the first 664,580 primes, and its associated file size is 6.25 MB.

    Once I had the completed file, "primes-list.txt", I could use it to factor composite integers, or to verify that an integer is prime.

    The procedure that factors a number is called, "list-factor". I called it that, because, it factors a number by using a list of primes.

    Since the list's largest prime, is 10,000,019, "list-factor" can factor any number that contains no factor larger than, (10,000,019)^2, or, 100,000,380,000,361. It means that the procedure can factor any number containing no factor larger than 14 digits.

    ******************************************************************************************************************************************************

    What happens if the procedure encounters a number (or a quotient of factors) that is bigger than the 15 digit number (10000019)^2, i.e., 100000380000361?

    For instance, the 15 digit number, 688846502588399 is greater than 100000380000361, and it happens to be prime.

    However, what happens if we multiply together the next two primes after, 10000019, i.e., 10000079 * 10000103?

    We get another 15 digit number, 100001820008137, which is also greater than 100000380000361, but, it is composite.

    Can the procedure determine that 688846502588399 is prime, and that 100001820008137 is composite?

    It cannot make the determination.

    The procedure works by taking floor(sqrt(n)), where n is the number (or quotient of a partial factorization) being factored.

    So, for 688846502588399, it finds, floor(sqrt(688846502588399)), equals 26245885.

    And for, 100001820008137, it finds, floor(sqrt(100001820008137)), equals 10000090.

    Both "roots" are bigger than the the biggest prime in the list, 10000019.

    So, in both cases, the procedure starts at the beginning of the primes list with 2, and divides each number by each prime in the list. If the remainder of a division is 0, then, that prime divisor is one of the number's factors.

    The procedure will stop before reaching the end of the primes list if it hasn't found a factor, and a candidate divisor is greater than the number's "root" (floor(sqrt(n))). (It is impossible for the smaller/smallest factor of an integer, n, to be larger than floor(sqrt(n)). For instance, consider n to be 21. floor(sqrt(21)) = 4. Therefore, if 21 is not prime, then, it must have a factor <= 4. And it does, 3. 3 * 7 = 21. You could look at it like this. If n = r * r, where r is the square root of n, and if n = a * b, where a and b are factors of n, then, if b is greater than r, a must be less than r.)

    If the procedure tries every prime in the list up to its "root", and doesn't find a factor, then, the number must be prime.

    (Incidentally, sqrt(21) = 4.58257569495584.. floor(sqrt(21)) = 4. ceiling(sqrt(21)) = 5. We are working with integers, so we have to use either 4 or 5 as potential factors. And, 5 * 5 = 25, would be greater than 21, so we use 4 as the maximum, i.e., floor(sqrt(n)).)

    In the cases of the two numbers above, in both cases, floor(sqrt(n)) is larger than the last prime in the list, 1000019. So, unless the procedure finds a factor, it will try dividing the numbers by every prime in the list. And in both cases, it will never find a factor. For 688846502588399, there are no factors (besides 1 and itself), because it is prime. For 100001820008137, neither of its factors (10000079 and 10000103) is in the list, because they are the next two primes after the list's largest prime, 100000019.

    In summary, when the procedure encounters a number (or a quotient of factors) that is larger than (10000019)^2, which it is unable to factor, it cannot say whether it is prime or composite. At that point, the procedure must stop.

    ----------------------------------------------------------------

    What do I mean by a "quotient of factors", or, a "quotient of a partial factorization"?

    Consider the integer, 2815417672486545963.

    If the procedure factors it, the first factor found will be 3.

    So, the procedure will divide the number by 3, getting the "quotient of the first factor", 938472557495515321.

    Then, the procedure will attempt to factor 938472557495515321, and it will find the second factor, 29.

    The procedure will divide 938472557495515321 by 29, getting the "quotient of the second factor", 32361122672259149.

    Now, the procedure will try to factor the "quotient of factors", 32361122672259149.

    But here, it has reached a dead end. 32361122672259149 is greater than (10000019)^2, and, it has no factors in the primes list.

    So, the procedure can only say that it has partially factored 2815417672486545963. It cannot say whether 32361122672259149 is prime or composite.

    ******************************************************************************************************************************************************

    In order to factor numbers, I run the code from within the DrRacket IDE. Doing that, compiles and loads the procedures. Then, I execute the procedure, "read-primes-file", from DrRacket's REPL (read-eval-print loop). That loads the list of primes. After that, I can factor numbers by executing the procedure, "list-factor".

    I made two more procedures, "time-read-primes-file", and, "time-list-factor". They time the "read-primes-file" and the "list-factor" procedures. The displayed times are in seconds.

    Below is shown the code, and example interactions using the REPL.

    (To factor bigger numbers, I will have to use another method.)

    (One more thing. If you want to instantaneously become at least a billionaire, all you need to do is to develop a procedure that will factor an arbitrary 1000 digit decimal integer, on an average computer, in one second.)



    ; code -------------------------------------------------------------------------------
    
    #lang racket
    (require racket/serialize)
    
    (define pl (list))
    (define start-time 0)
    
    (define (set-start-time)
      (set! start-time (current-milliseconds)))
    
    (define (elapsed-time)
      (/ (- (current-milliseconds) start-time) 3600000.0))
    
    (define (set-initial-primes-file)
      (define out (open-output-file "primes-list.txt" #:exists 'replace))
      (define l (list 2 3 5 7))
      (write (serialize l) out)
      (close-output-port out))
    
    (define (append-primes-file total-time)
      (read-primes-file)
      (set-start-time)
      (displayln (length pl))
      (define last-prime (last pl))
      (define (test-next-number n)
        (cond
          ((> (last pl) 10000000) void)
          ((even? n) (test-next-number (+ n 1)))
          ((> (elapsed-time) total-time) void)
          (else
           (define n-root (floor (sqrt n)))
           (define result (check-list n n-root pl))
           (when result (set! pl (append pl (list n))))
           (test-next-number (+ n 1)))))
      (define (check-list  test test-root l)
        (define check (car l))
        (cond
          ((null? l) #t)
          ((> check test-root) #t)
          ((= (modulo test check) 0) #f)
          (else (check-list test test-root (cdr l)))))
      (test-next-number (+ last-prime 1))
      (write-primes-file))
    
    (define (write-primes-file)
      (define out (open-output-file "primes-list.txt" #:exists 'replace))
      (write (serialize pl) out)
      (close-output-port out))
    
    (define (read-primes-file)
      (define in (open-input-file "primes-list.txt"))
      (set! pl (deserialize (read in)))
      (close-input-port in))
    
    (define (list-factor n)
      (define factors (list))  
      (define (test-factors n1 l)
        (cond
          ((and (null? l) (null? factors)) 
           (displayln "The procedure cannot factor this number.")
           (displayln "Either it is a prime greater than 100,000,380,000,361,")
           (displayln "or it is a composite containing factors, all of which are greater than 10,000,019."))
          ((and (null? l) (not (null? factors)))
           (set! factors (append factors (list n1)))
           (displayln "The procedure cannot fully factor this number.")
           (displayln "The last factor is either a prime greater than 100,000,380,000,361,")
           (displayln "or it is a composite containing sub-factors, all of which are greater than 10,000,019.")
           factors)
          (else
           (define test (car l))
           (define sr (floor (sqrt n1)))
           (cond
             ((= (modulo n1 test) 0)
              (set! factors (append factors (list test)))
              (check-duplicates n1 test l))
             ((> test sr)
              (set! factors (append factors (list n1)))
              factors)          
             (else (test-factors n1 (cdr l)))))))  
      (define (check-duplicates n2 t l1)
        (define n3 (/ n2 t))
        (cond
          ((= n3 1) factors)
          ((= (modulo n3 t) 0)
           (set! factors (append factors (list t)))
           (check-duplicates n3 t l1))
          (else (test-factors n3 (cdr l1)))))
      (cond
        ((= n 0)
         (list 0))
        ((= n 1)
         (list 1))
        (else (test-factors n pl))))
    
    (define (time-read-primes-file)
      (let ((t1 (current-milliseconds)))
        (read-primes-file)
        (let ((t2 (current-milliseconds)))
          (let ((tt (/ (- t2 t1) 1000.0)))
            tt))))
    
    (define (time-list-factor n)
      (let ((t1 (current-milliseconds)))
        (let ((factor-list (list-factor n)))
          (let ((t2 (current-milliseconds)))
            (let ((tt (/ (- t2 t1) 1000.0)))
              (values factor-list tt))))))
    
    ; REPL interactions ------------------------------------------------------------------
    
    Welcome to DrRacket, version 5.1 [3m].
    Language: racket.
    > (time-read-primes-file)
    4.009
    > (list-factor 0)
    '(0)
    > (list-factor 1)
    '(1)
    > (list-factor 2)
    '(2)
    > (list-factor 3)
    '(3)
    > (list-factor 4)
    '(2 2)
    > (list-factor 5)
    '(5)
    > (list-factor 6)
    '(2 3)
    > (list-factor 1024)
    '(2 2 2 2 2 2 2 2 2 2)
    > (time-list-factor 111111111)
    '(3 3 37 333667)
    0.001
    > (time-list-factor 1111111111)
    '(11 41 271 9091)
    0.001
    > (time-list-factor 11111111111)
    '(21649 513239)
    0.319
    > (time-list-factor 111111111111)
    '(3 7 11 13 37 101 9901)
    0.001
    > (time-list-factor 1111111111111)
    '(53 79 265371653)
    0.002
    > (time-list-factor 11111111111111)
    '(11 239 4649 909091)
    0.001
    > (time-list-factor 111111111111111)
    '(3 31 37 41 271 2906161)
    0.001
    > (time-list-factor 1111111111111111)
    '(11 17 73 101 137 5882353)
    0.001
    > (time-list-factor 11111111111111111)
    '(2071723 5363222357)
    24.243
    > (time-list-factor 111111111111111111)
    '(3 3 7 11 13 19 37 52579 333667)
    0.536
    > (time-list-factor 1111111111111111111)
    The procedure cannot factor this number.
    Either it is a prime greater than 100,000,380,000,361,
    or it is a composite containing factors, all of which are greater than 10,000,019.
    97.527
    > (time-list-factor 11111111111111111111)
    '(11 41 101 271 3541 9091 27961)
    1.628
    > (time-list-factor 111111111111111111111)
    '(3 37 43 239 1933 4649 10838689)
    1.636
    > (time-list-factor 1111111111111111111111)
    '(11 11 23 4093 8779 21649 513239)
    0.21
    > (time-list-factor 11111111111111111111111)
    The procedure cannot factor this number.
    Either it is a prime greater than 100,000,380,000,361,
    or it is a composite containing factors, all of which are greater than 10,000,019.
    107.971
    > (time-list-factor 111111111111111111111111)
    '(3 7 11 13 37 73 101 137 9901 99990001)
    0.125
    > (time-list-factor 1111111111111111111111111)
    '(41 271 21401 25601 182521213001)
    6.208
    > (time-list-factor 11111111111111111111111111)
    The procedure cannot fully factor this number.
    The last factor is either a prime greater than 100,000,380,000,361,
    or it is a composite containing sub-factors, all of which are greater than 10,000,019.
    '(11 53 79 859 280846283204599997)
    91.478
    > (time-list-factor 111111111111111111111111111)
    The procedure cannot fully factor this number.
    The last factor is either a prime greater than 100,000,380,000,361,
    or it is a composite containing sub-factors, all of which are greater than 10,000,019.
    '(3 3 3 37 757 333667 440334654777631)
    87.828
    > (time-list-factor 1111111111111111111111111111)
    '(11 29 101 239 281 4649 909091 121499449)
    10.889
    > (time-list-factor 11111111111111111111111111111)
    '(3191 16763 43037 62003 77843839397)
    2.168
    > (time-list-factor 111111111111111111111111111111)
    '(3 7 11 13 31 37 41 211 241 271 2161 9091 2906161)
    0.14
    > (time-list-factor 1111111111111111111111111111111)
    The procedure cannot fully factor this number.
    The last factor is either a prime greater than 100,000,380,000,361,
    or it is a composite containing sub-factors, all of which are greater than 10,000,019.
    '(2791 6943319 57336415063790604359)
    94.458
    > (time-list-factor 11111111111111111111111111111111)
    '(11 17 73 101 137 353 449 641 1409 69857 5882353)
    0.593
    > (time-list-factor 111111111111111111111111111111111)
    The procedure cannot fully factor this number.
    The last factor is either a prime greater than 100,000,380,000,361,
    or it is a composite containing sub-factors, all of which are greater than 10,000,019.
    '(3 37 67 21649 513239 1344628210313298373)
    91.104
    > (time-list-factor 1111111111111111111111111111111111)
    The procedure cannot fully factor this number.
    The last factor is either a prime greater than 100,000,380,000,361,
    or it is a composite containing sub-factors, all of which are greater than 10,000,019.
    '(11 103 4013 2071723 117957818840753430733)
    92.212
    > (time-list-factor 11111111111111111111111111111111111)
    The procedure cannot fully factor this number.
    The last factor is either a prime greater than 100,000,380,000,361,
    or it is a composite containing sub-factors, all of which are greater than 10,000,019.
    '(41 71 239 271 4649 123551 102598800232111471)
    94.068
    > (time-list-factor 111111111111111111111111111111111111)
    '(3 3 7 11 13 19 37 101 9901 52579 333667 999999000001)
    12.855
    > (time-list-factor 1111111111111111111111111111111111111)
    The procedure cannot fully factor this number.
    The last factor is either a prime greater than 100,000,380,000,361,
    or it is a composite containing sub-factors, all of which are greater than 10,000,019.
    '(2028119 547853016076034547830334961169)
    88.655
    > (time-list-factor 11111111111111111111111111111111111111)
    The procedure cannot fully factor this number.
    The last factor is either a prime greater than 100,000,380,000,361,
    or it is a composite containing sub-factors, all of which are greater than 10,000,019.
    '(11 1010101010101010101010101010101010101)
    93.319
    > (time-list-factor 111111111111111111111111111111111111111)
    The procedure cannot fully factor this number.
    The last factor is either a prime greater than 100,000,380,000,361,
    or it is a composite containing sub-factors, all of which are greater than 10,000,019.
    '(3 37 53 79 239073561261285168617387389778123)
    94.863
    > (time-list-factor 1111111111111111111111111111111111111111)
    '(11 41 73 101 137 271 3541 9091 27961 1676321 5964848081)
    15.662
    > (time-list-factor 11111111111111111111111111111111111111111)
    The procedure cannot fully factor this number.
    The last factor is either a prime greater than 100,000,380,000,361,
    or it is a composite containing sub-factors, all of which are greater than 10,000,019.
    '(83 1231 538987 201763709900322803748657942361)
    94.13
    > (time-list-factor 1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111)
    The procedure cannot fully factor this number.
    The last factor is either a prime greater than 100,000,380,000,361,
    or it is a composite containing sub-factors, all of which are greater than 10,000,019.
    '(11 41 101 251 271 3541 5051 9091 21401 25601 27961 60101 7019801 341233306557836423189042926585457900151074303303755301)
    90.121
    > (time-list-factor 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111)
    The procedure cannot fully factor this number.
    The last factor is either a prime greater than 100,000,380,000,361,
    or it is a composite containing sub-factors, all of which are greater than 10,000,019.
    '(11
      41
      73
      101
      137
      251
      271
      401
      1201
      1601
      3541
      5051
      9091
      21401
      25601
      27961
      60101
      1676321
      7019801
      263980647407951828155827620420447036038246324232161580581998828957153987896442603868348285938940685117797680817907519399405310559008181)
    97.999
    >
    
    Last edited by danbaron; 06-05-2011 at 06:58.
    "You can't cheat an honest man. Never give a sucker an even break, or smarten up a chump." - W.C.Fields

  2. #2
    Member Johannes's Avatar
    Join Date
    Nov 2010
    Location
    Wuustwezel, Belgium
    Age
    56
    Posts
    95
    Rep Power
    25
    Quote Originally Posted by danbaron View Post
    It turns out that the first prime greater than 10,000,000, is, 10,000,019. The complete list contains the first 664,580 primes, and its associated file size is 6.25 MB.
    That's an average of almost 10 bytes per prime. Does Racket store integers (or numbers in general) as decimal ASCII?

    I have to say I'm very impressed with Racket's code density. I have a little trouble reading it. (I know you do MODs using the values from prime list pl but I wouldn't want to have to change anything in your program.)

    I'm tempted. You know I am.
    Boole and Turing, help me!

    Primary programming: 200 MHz ARM StrongARM, RISC OS 4.02, BASIC V, ARM assembler.
    Secondary programming: 3.16 GHz Intel Core 2 Duo E8500, Vista Home Premium SP2, thinBasic, x86 assembler.

  3. #3
    Member Johannes's Avatar
    Join Date
    Nov 2010
    Location
    Wuustwezel, Belgium
    Age
    56
    Posts
    95
    Rep Power
    25
    Quote Originally Posted by danbaron View Post
    (One more thing. If you want to instantaneously become at least a billionaire, all you need to do is to develop a procedure that will factor an arbitrary 1000 digit decimal integer, on an average computer, in one second.)
    As it stands now, the best algorithm to factor large numbers is the General number field sieve.

    The running time of the number field sieve is super-polynomial but sub-exponential in the size of the input.
    If you developed an algorithm to do it in polynomial time you'd get the Nobel Prize in mathematics, guaranteed. If you can do it in O(n^2) they'd probably make you king of a country of your choice.

    If the CIA or FSB didn't kill you first.
    Boole and Turing, help me!

    Primary programming: 200 MHz ARM StrongARM, RISC OS 4.02, BASIC V, ARM assembler.
    Secondary programming: 3.16 GHz Intel Core 2 Duo E8500, Vista Home Premium SP2, thinBasic, x86 assembler.

  4. #4
    thinBasic MVPs danbaron's Avatar
    Join Date
    Jan 2010
    Location
    California
    Posts
    1,378
    Rep Power
    152
    Racket is basically Scheme. I didn't know anything about Scheme. jack clued me into it.

    http://www.thinbasic.com/community/s...ht=lisp+videos

    (I think what they refer to as Lisp in the videos, is actually Scheme.)

    I am convinced that Scheme is a very smart language. And, I don't think there is a better implementation of it than Racket. In fact, I have also downloaded and tried both the Allegro and LispWorks implementations of Common Lisp. They are both expensive commercial products. I swear that Racket (absolutely free) is better than both of them. (On the Internet you can quickly learn that, often, how much something costs has no relation to how much it is worth.) (Of course, you are correct about MODs and pl. And, you can change anything you want. What does it hurt? It's free, as in, "free beer!".)

    Here's a tip. In Scheme and Common Lisp, "(car l)", means the first element of the list, l. "(cdr l)", means the remainder of the list (with (car l) removed). The names, car and cdr, are artifacts of the distant past, which come from the architecture of the computer that Lisp was first developed on.

    http://en.wikipedia.org/wiki/CAR_and_CDR


    ---------------------------------------------------------------------------------------

    Here is the beginning of the 6.25 MB file, "primes-list.txt".

    ((2) 0 () 0 () () (c 2 c 3 c 5 c 7 c 11 c 13 c 17 c 19 c 23 c 29 c 31 c 37 c 41 c 43 c 47 c 53 c 59 c 61 c 67 c 71 c 73 c 79 c 83 c 89 c 97 c 101 c 103 c 107 c 109 c 113 c 127 c 131 c 137 c 139 c 149 c 151 c 157 c 163 c 167 c 173 c 179 c 181 c 191 c 193 c 197 c 199 c 211 c 223 c 227 c 229 c 233 c 239 c 241 c 251 c 257 c 263 c 269 c 271 c 277 c 281 c 283 c 293 c 307 c 311 c 313 c 317 c 331 c 337 c 347 c 349 c 353 c 359 c 367 c 373 c 379 c 383 c 389 c 397 c 401 c 409 c 419 c 421 c 431 c 433 c 439 c 443 c 449 c 457 c 461 c 463 c 467 c 479 c 487 c 491 c 499 c 503 c 509 c 521 c 523 c 541 c 547 c 557 c 563 c 569 c 571 c 577 c 587 c 593 c 599 c 601 c 607 c 613 c 617 c 619 c 631 c 641 c 643 c 647 c 653 c 659 c 661 c 673 c 677 c 683 c 691 c 701 c 709 c 719 c 727 c 733 c 739 c 743 c 751 c 757 c 761 c 769 c 773 c 787 c 797 c 809 c 811 c 821 c 823 c 827 c 829 c 839 c 853 c 857 c 859 c 863 c 877 c 881 c 883 c 887 c 907 c 911 c 919 c 929 c 937 c 941 c 947 c 953 c 967 c 971 c 977 c 983 c 991 c 997 c 1009 c 1013 c 1019 c 1021 c 1031 c 1033 c 1039 c

    And here is the end of it.

    9998701 c 9998719 c 9998741 c 9998743 c 9998749 c 9998753 c 9998777 c 9998797 c 9998801 c 9998809 c 9998851 c 9998861 c 9998867 c 9998887 c 9998893 c 9998903 c 9998929 c 9998969 c 9998971 c 9998977 c 9999047 c 9999049 c 9999053 c 9999071 c 9999083 c 9999161 c 9999163 c 9999167 c 9999193 c 9999217 c 9999221 c 9999233 c 9999271 c 9999277 c 9999289 c 9999299 c 9999317 c 9999337 c 9999347 c 9999397 c 9999401 c 9999419 c 9999433 c 9999463 c 9999469 c 9999481 c 9999511 c 9999533 c 9999593 c 9999601 c 9999637 c 9999653 c 9999659 c 9999667 c 9999677 c 9999713 c 9999739 c 9999749 c 9999761 c 9999823 c 9999863 c 9999877 c 9999883 c 9999889 c 9999901 c 9999907 c 9999929 c 9999931 c 9999937 c 9999943 c 9999971 c 9999973 c 9999991 c 10000019))

    ---------------------------------------------------------------------------------------

    There is no Nobel Prize in mathematics.

    http://nobelprizes.com/nobel/why_no_math.html

    http://en.wikipedia.org/wiki/Fields_Medal

    ---------------------------------------------------------------------------------------

    I looked at the Wikipedia article about the General number field sieve. My first impression is that I am not optimistic about being able to correctly do it. But, there are other methods between what I did and it. Maybe I can do one or more of those.

    ---------------------------------------------------------------------------------------

    As far as me trying to increase the size of my primes list, I think that is a dead end.

    To estimate the number of primes less than or equal to x, we use the formula pi(x) ~= x / ln(x).

    For x = 10^7, that is 620,421 (but actually there are 664,579, the biggest prime in the list is 10^7 + 19).

    For x = 10^8, that is 5,428,681.

    So, just to increase my list so that the biggest prime was greater than 10^8, I would have to add approximately 5,428,681 - 664,580 = 4,764,101 primes.

    It took approximately 16 hours to make the list as it is, with the first 664,580 primes. Assuming that the rate is the same, it would take approximately 4,764,101 / 664,580 * 16 = 115 hours to get the first prime greater than 10^8. But, of course, even worse is the fact that the rate at which the list grows, continually decreases. (I guess we could figure out the equation for the rate.)

    Additionally, the file would at least increase in size to approximately 5,428,681 / 664,580 * 6.25 = 51.05 MB. But, of course, even worse is the fact that on average, each additional prime requires more storage space than the previous one.

    And, even if I was able to make the list contain the first prime greater than 10^8, I would only be guaranteed of being able to factor any number with no factor larger than 16 digits. Already, I can factor any number with no factor larger than 14 digits.

    How about if we wanted to make the list contain all of the primes up to 10^9?

    10^9 / ln(10^9) = 48,254,942

    The floor estimate for the size of that file would be 48,254,942 / 664,580 * 6.25 = 453.81 MB.

    And then, we would only be guaranteed of being able to factor any number with no factor larger than 18 digits.

    I think absolutely, now or in the future, the method of finding the first y primes, will confound any computer. Even if at the start it is spitting out primes at the rate of 10^50 per second, sooner or later it will slow to one prime per year, and then one prime per million years, etc.

    ---------------------------------------------------------------------------------------

    Absolutely, if you made a method to do instantaneous factoring, you would have to proceed very very carefully. You would have to write it down and hide it somewhere, maybe out in the forest under a rock. You would have to destroy all other traces of it. If you did those things, maybe you could walk into the CIA and negotiate. (But, there's no guarantees in a situation like this, right? You might find yourself on the way to Guantanamo. Maybe if you agreed to split the money with a high government official, you would have a protector, who knows?)


    Last edited by danbaron; 01-05-2011 at 20:00.
    "You can't cheat an honest man. Never give a sucker an even break, or smarten up a chump." - W.C.Fields

  5. #5
    thinBasic MVPs danbaron's Avatar
    Join Date
    Jan 2010
    Location
    California
    Posts
    1,378
    Rep Power
    152
    I changed the code and REPL interactions a little bit (above).

    (Better not to be greedy, right?)





    "You can't cheat an honest man. Never give a sucker an even break, or smarten up a chump." - W.C.Fields

  6. #6
    thinBasic MVPs danbaron's Avatar
    Join Date
    Jan 2010
    Location
    California
    Posts
    1,378
    Rep Power
    152
    If you notice above in the REPL interactions, the code could not even partially factor, two of the integers I tried,

    1111111111111111111 (19 ones),

    11111111111111111111111 (23 ones).

    I went to, http://www.wolframalpha.com/ .

    I tried to factor them both using the commands, FactorInteger[1111111111111111111], and, FactorInteger[11111111111111111111111].

    "Super-Mathematica" replied, that both of them are primes.

    So, both numbers consist of only one factor, themselves (or, if you want, two factors, themselves and 1). In other words, each is already factored.

    Since the largest prime in the primes list I made is 10,000,019, the code can make no further progress if it encounters a factor larger than (10,000,019)^2, or, 100,000,380,000,361.

    So, basically, if the smallest factor (the code finds the factors in the order of smallest to largest) in an integer is more than 14 digits, the code will respond that it cannot factor it at all.

    The code will factor any prime of less than 15 digits. In that case it will return a list containing one member, the prime itself.

    Last edited by danbaron; 01-05-2011 at 21:06.
    "You can't cheat an honest man. Never give a sucker an even break, or smarten up a chump." - W.C.Fields

  7. #7
    Member Johannes's Avatar
    Join Date
    Nov 2010
    Location
    Wuustwezel, Belgium
    Age
    56
    Posts
    95
    Rep Power
    25
    Quote Originally Posted by danbaron View Post
    There is no Nobel Prize in mathematics.
    The bastards.

    Come up with an algorithm to factor a composite integer in linear time and they'll create an award for the occasion.

    About storage of primes, there are very efficient ways.

    If you assume that 2 is prime then you can ignore all other even primes and you could use bits to indicate primality or not. You'd need one bit for every two natural numbers. If you also assume that 3 is prime then you can also ignore all multiples of 3 and you'd need one bit for every three natural numbers. Add 5 as a prime and the ratio becomes one bit for every 4 out of 15 numbers: 8 bits (also known as one "byte") per 30 numbers. (This third one is a standard method of compact prime storing.)

    For the primes up to 10 million you'd need 10^7/30 = 333,334 bytes = 325.52 kBytes (as compared to Racket's 6.25 MB).

    You can keep extending the efficiency by assuming the next prime number as a given. Without assuming anything the ratio is of course exactly 1. Assuming 2 is prime it becomes 1:2. Assuming 3 is prime it becomes 1:3 (or rather 2:6, you'll see why in a minute). Assuming 5 is prime it becomes 8:30. The logic behind it is if you assume the next prime p as given then the new ratio becomes the old ratio multiplied by n-1/n. So assuming 7 as prime changes the ratio from 8:30 to 8*(7-1):30*7 = 48:210 or six bytes for every 210 natural numbers. The next prime is 11 and assuming that one would change the ratio to 480:2310.

    Storing those primes up to 10 million with 2, 3, 5, 7 and 11 given as prime would now take 259,800 bytes or 253.71 kBytes, 77.94% of the earlier example.

    But you're right. At one point either storage or calculation time will become prohibitive. No matter how fast your computer is, at one point the next prime will take a million years to compute.
    Boole and Turing, help me!

    Primary programming: 200 MHz ARM StrongARM, RISC OS 4.02, BASIC V, ARM assembler.
    Secondary programming: 3.16 GHz Intel Core 2 Duo E8500, Vista Home Premium SP2, thinBasic, x86 assembler.

  8. #8
    thinBasic MVPs danbaron's Avatar
    Join Date
    Jan 2010
    Location
    California
    Posts
    1,378
    Rep Power
    152
    I believe you about the storage.

    I changed the code and REPL interactions (above) again.

    For some reason my mind thinks of things sequentially, not simultaneously.



    Last edited by danbaron; 02-05-2011 at 00:22.
    "You can't cheat an honest man. Never give a sucker an even break, or smarten up a chump." - W.C.Fields

  9. #9
    Member Johannes's Avatar
    Join Date
    Nov 2010
    Location
    Wuustwezel, Belgium
    Age
    56
    Posts
    95
    Rep Power
    25
    Quote Originally Posted by danbaron View Post
    It took approximately 16 hours to make the list as it is, with the first 664,580 primes.
    Six minutes flat. (A lie: it was 6m01s.) And not a single assembler command; it's all in standard thinBasic.

    I'm using the storage method as described above, with the first six primes assumed (2 through 13). This has the advantage that only 5760 out of 30,030 numbers have to be checked for primality, a speed gain of a factor of five. (I think you just skipped the even numbers, which is of course a speed gain of a factor of two.)

    At first the program would have taken fifty times (!) as long but I was able to speed it up by adding one single standard thinBasic command.
    DoEvents(Off)
    
    Unbelievable that the Windows events take up so much time.

    I'm still tweaking the prime generating program a little, and I'm also writing another program to be able to use that enormous primes file. My goal is to calculate all 32-bit primes...
    Boole and Turing, help me!

    Primary programming: 200 MHz ARM StrongARM, RISC OS 4.02, BASIC V, ARM assembler.
    Secondary programming: 3.16 GHz Intel Core 2 Duo E8500, Vista Home Premium SP2, thinBasic, x86 assembler.

  10. #10
    thinBasic MVPs danbaron's Avatar
    Join Date
    Jan 2010
    Location
    California
    Posts
    1,378
    Rep Power
    152
    I'll try to make a faster way too.



    Last edited by danbaron; 04-05-2011 at 11:17.
    "You can't cheat an honest man. Never give a sucker an even break, or smarten up a chump." - W.C.Fields

Page 1 of 3 123 LastLast

Similar Threads

  1. Haskell: factoring into primes
    By danbaron in forum Science
    Replies: 0
    Last Post: 12-06-2011, 08:47
  2. Factoring 2
    By danbaron in forum Science
    Replies: 0
    Last Post: 09-05-2011, 06:05

Members who have read this thread: 0

There are no members to list at the moment.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •