https://py-code.org/stats

The contents of PyPI, in numbers

Total files: 1.03 billion (78,001,202 unique)
Total lines of text: 321.7 billion (321,652,308,615 to be precise)
Total uncompressed size: 55.0 TiB (that is ~41,213,995.542 floppy disks)
Lines of code added per second: 3,067 (in the month 2023-08-01)

Language Features

This data only counts unique projects, not versions. For example, if a project has published 10 versions in a month, each containing an async function, it will only be counted once.

[Charts: Mature Features, New Features, Comprehensions]

Breakdown

| Name | Projects | Percent |
|---|---|---|
| list comp | 226,072 | 49 |
| fstring | 155,226 | 34 |
| annotations | 131,143 | 29 |
| generator expression | 121,229 | 26 |
| dict comp | 89,232 | 19 |
| async | 29,329 | 6 |
| dataclasses | 26,935 | 6 |
| set comp | 21,746 | 5 |
| walrus | 9,728 | 2 |
| match | 2,676 | 1 |
| async comp | 1,064 | 0 |
| try star | 22 | 0 |

Project Contents

This data only counts unique projects, not versions. For example, if a project has published 10 versions in a month, each with a setup.py file, it will only be counted once.

[Charts: Setup.py vs PyProject.toml, Markdown vs RST, Other Files, Typing?]

Secrets Detected

PyPI contains a lot of secrets. Only 15 of the 94 rows are shown here.

| Type | Count |
|---|---|
| Google API Key | 4,015 |
| OpenAI API Key | 3,531 |
| Tencent Cloud Secret ID | 1,855 |
| Amazon AWS Secret Access Key | 1,631 |
| Amazon AWS Access Key ID | 1,364 |
| Google Cloud Private Key ID | 1,080 |
| Slack API Token | 1,059 |
| Telegram Bot Token | 863 |
| Slack Incoming Webhook URL | 775 |
| SendGrid API Key | 748 |
| Mailgun API Key | 716 |
| Mailchimp API Key | 674 |
| Stripe API Key | 662 |
| Twilio Account String Identifier | 567 |
| Alibaba Cloud AccessKey Secret | 555 |
| Total | 25,479 |

Growth

Releases: 8.92 million
Size: 60 TB
Files: 1 billion

PyPI is growing fast. If this dangerous expansion is not stopped, our advanced machine learning models predict that in only 8 years the number of packages will outnumber human beings.

Binary files

This shows a breakdown of the binary files on PyPI, by extension.
Binary files are the vast majority of the content on PyPI, accounting for nearly 75% of the uncompressed size.

| Extension | Total files | Total size | Unique files |
|---|---|---|---|
| .so | 6,243,810 | 19.3 TiB | 3,496,663 |
| .pyd | 1,667,528 | 3.9 TiB | 1,532,144 |
| .dylib | 1,023,802 | 2.6 TiB | 342,433 |
| .dll | 1,241,808 | 1.9 TiB | 366,405 |
| No extension | 4,720,109 | 1.8 TiB | 1,660,310 |
| .2 | 146,226 | 1.4 TiB | 17,079 |
| .0 | 569,665 | 1.2 TiB | 64,736 |
| .jar | 392,763 | 809.7 GiB | 41,668 |
| .png | 24,664,646 | 483.1 GiB | 763,193 |
| .1 | 277,139 | 434.3 GiB | 38,905 |
| .lib | 109,096 | 405.6 GiB | 31,724 |
| .exe | 189,591 | 379.3 GiB | 42,593 |
| .gz | 4,120,142 | 374.4 GiB | 538,656 |
| .tgz | 332,786 | 351.2 GiB | 153,591 |
| .7 | 36,242 | 304.2 GiB | 2,478 |
| Total | 45,735,353 | 35.6 TiB | 9,092,578 |

Largest Projects by size

TensorFlow dominates this list with 8.8 TiB of uncompressed data, 16% of all data on PyPI.

| Project name | Unique files | Total files | Total lines | Total size |
|---|---|---|---|---|
| tf-nightly | 85,750 | 20,242,423 | 8,325,796,068 | 2.3 TiB |
| tf-nightly-cpu | 80,842 | 19,960,824 | 8,080,533,316 | 1.8 TiB |
| tf-nightly-gpu | 70,457 | 11,657,756 | 4,814,002,048 | 1.4 TiB |
| lalsuite | 1,719,153 | 9,969,505 | 4,542,135,454 | 1.1 TiB |
| tensorflow | 97,815 | 7,598,894 | 2,852,350,755 | 857.9 GiB |
| paddlepaddle-gpu | 30,859 | 2,000,670 | 433,161,447 | 856.7 GiB |
| tensorflow-io-nightly | 14,020 | 927,623 | 116,109,192 | 742.5 GiB |
| tf-nightly-cpu-aws | 46,547 | 7,864,827 | 3,031,860,974 | 638.9 GiB |
| tensorflow-gpu | 83,932 | 4,276,504 | 1,575,491,698 | 638.0 GiB |
| catboost-dev | 32,526 | 256,620 | 66,348,683 | 582.2 GiB |
| tf-nightly-intel | 77,131 | 7,263,510 | 2,958,885,573 | 488.4 GiB |
| tensorflow-cpu | 55,502 | 4,843,387 | 1,877,600,308 | 470.8 GiB |
| OpenVisus | 59,627 | 3,658,052 | 718,721,460 | 429.1 GiB |
| tf-nightly-macos | 22,808 | 4,487,368 | 2,145,554,908 | 415.9 GiB |
| frida | 35,297 | 185,908 | 19,342,825 | 402.1 GiB |
| Total | 2,512,266 | 105,193,871 | 41,557,894,709 | 13.0 TiB |

Stats By Extensions

This only considers the last suffix of the file path as the extension.

| Extension | Total files | Total lines | Total size | Unique files |
|---|---|---|---|---|
| .py | 447,209,916 | 117,091,120,776 | 4.3 TiB | 30,428,193 |
| .h | 79,696,931 | 24,023,332,276 | 950.5 GiB | 685,859 |
| No extension | 57,001,933 | 7,573,202,697 | 2.2 TiB | 15,241,326 |
| .json | 51,352,075 | 20,516,757,230 | 1.0 TiB | 1,549,579 |
| .hpp | 37,504,335 | 7,718,547,007 | 311.0 GiB | 274,766 |
| .txt | 36,607,936 | 17,011,067,197 | 624.8 GiB | 3,262,212 |
| .js | 30,544,896 | 12,366,504,866 | 1.0 TiB | 1,288,083 |
| .png | 24,699,989 | 895,013 | 483.2 GiB | 765,787 |
| .rst | 21,191,351 | 1,333,139,493 | 51.2 GiB | 1,209,917 |
| .svg | 16,658,470 | 1,346,685,636 | 169.5 GiB | 306,782 |
| .pyi | 16,063,875 | 3,128,935,978 | 103.4 GiB | 452,920 |
| .html | 15,023,700 | 2,788,572,281 | 197.1 GiB | 1,565,254 |
| .yaml | 11,317,560 | 1,067,089,460 | 38.4 GiB | 323,115 |
| .pyc | 10,611,003 | 276,458 | 66.7 GiB | 4,953,250 |
| .yml | 9,898,202 | 705,642,379 | 21.0 GiB | 293,829 |
| Total | 865,382,172 | 216,671,768,747 | 11.5 TiB | 62,600,872 |

Files not committed to GitHub

Not all files can be committed to GitHub due to size limits. Some have a few very, very long lines, whilst others are junk like mistakenly added virtualenvs or VCS directories. This table shows a breakdown of the reasons why files were skipped.

| Skip reason | Count | Unique files | Max size | Max lines | Total size | Total lines | Total projects |
|---|---|---|---|---|---|---|---|
| empty | 30,284,053 | 1 | 0 B | 0 | 0 B | 0 | 207,504 |
| binary | 81,600,252 | 15,761,098 | 1.9 GiB | 1 | 40.8 TiB | 126,101 | 105,920 |
| virtualenv | 9,122,068 | 428,879 | 2.1 MiB | 71,126 | 85.8 GiB | 2,596,702,054 | 5,609 |
| too-large | 6,444,173 | 734,612 | 1.4 GiB | 20,010,001 | 7.3 TiB | 133,426,037,163 | 30,795 |
| text-long-lines | 1,350,851 | 146,999 | 5.0 MiB | 5 | 136.4 GiB | 2,205,111 | 9,791 |
| version-control | 160,162 | 26,739 | 197.0 KiB | 10,500 | 173.0 MiB | 4,447,640 | 599 |
| Total | 128,961,559 | 17,098,328 | 3.3 GiB | 20,091,633 | 48.3 TiB | 136,029,518,069 | 360,218 |

Created by Tom Forbes.
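
A note on the "last suffix" rule from the Stats By Extensions section: it helps explain why extensions such as .2, .0 and .gz appear in the tables, since a versioned shared library or a .tar.gz archive is grouped by its final suffix only. The following is a minimal Python sketch of that grouping, not the SQL the site actually runs. The function names `last_suffix` and `count_by_extension` and the sample paths are hypothetical, and using `pathlib` to find the suffix is an assumption about how the rule behaves.

```python
from collections import Counter
from pathlib import PurePosixPath


def last_suffix(path: str) -> str:
    """Return only the final suffix of a path, or "No extension" if there is none.

    Hypothetical helper: mirrors the "last suffix" rule described on the page,
    using pathlib's definition of a suffix (an assumption).
    """
    suffix = PurePosixPath(path).suffix
    return suffix if suffix else "No extension"


def count_by_extension(paths):
    """Count how many paths fall under each final suffix."""
    return Counter(last_suffix(p) for p in paths)


# Made-up example paths, for illustration only.
sample_paths = [
    "pkg/module.py",
    "pkg/other.py",
    "pkg/lib/libfoo.so.2",
    "pkg/data/archive.tar.gz",
    "pkg/LICENSE",
]
counts = count_by_extension(sample_paths)
```

With these sample paths, `libfoo.so.2` is counted under `.2` rather than `.so`, and `archive.tar.gz` under `.gz`, matching the kind of rows seen in the tables.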