https://github.com/michalc/sqlite-s3-query (MIT License)
# sqlite-s3-query

Python function to query a SQLite file stored on S3. It uses multiple HTTP range requests per query to avoid downloading the entire file, and so is suitable for large databases.

All the HTTP requests for a query request the same version of the database object in S3, so queries should complete successfully even if the database is replaced concurrently by another S3 client. Versioning must be enabled on the S3 bucket.

Operations that write to the database are not supported.

## Installation

sqlite-s3-query depends on APSW, which is not available on PyPI, but can be installed directly from GitHub.
```shell
pip install sqlite_s3_query
pip install https://github.com/rogerbinns/apsw/releases/download/3.36.0-r1/apsw-3.36.0-r1.zip \
    --global-option=fetch --global-option=--version --global-option=3.36.0 --global-option=--all \
    --global-option=build --global-option=--enable-all-extensions
```

## Usage

```python
from sqlite_s3_query import sqlite_s3_query

results_iter = sqlite_s3_query(
    'SELECT * FROM my_table WHERE my_column = ?',
    params=('my-value',),
    url='https://my-bucket.s3.eu-west-2.amazonaws.com/my-db.sqlite',
)

for row in results_iter:
    print(row)
```

If your project runs multiple queries against the same file, `functools.partial` can be used to make an interface with less duplication.

```python
from functools import partial
from sqlite_s3_query import sqlite_s3_query

# Bind the URL once so each query only supplies the SQL and its parameters
query_my_db = partial(sqlite_s3_query,
    url='https://my-bucket.s3.eu-west-2.amazonaws.com/my-db.sqlite',
)

for row in query_my_db('SELECT * FROM my_table WHERE my_col = ?', params=('my-value',)):
    print(row)

for row in query_my_db('SELECT * FROM my_table_2 WHERE my_col = ?', params=('my-value',)):
    print(row)
```

The AWS region and the credentials are taken from environment variables, but this can be changed using the `get_credentials` parameter. The default implementation, which can be overridden, is shown below.

```python
import os
from functools import partial
from sqlite_s3_query import sqlite_s3_query

query_my_db = partial(sqlite_s3_query,
    url='https://my-bucket.s3.eu-west-2.amazonaws.com/my-db.sqlite',
    get_credentials=lambda: (
        os.environ['AWS_DEFAULT_REGION'],
        os.environ['AWS_ACCESS_KEY_ID'],
        os.environ['AWS_SECRET_ACCESS_KEY'],
        os.environ.get('AWS_SESSION_TOKEN'),  # Only needed for temporary credentials
    ),
)

for row in query_my_db('SELECT * FROM my_table_2 WHERE my_col = ?', params=('my-value',)):
    print(row)
```
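As an illustration of overriding `get_credentials`, the sketch below builds a zero-argument callable that returns the same four-item tuple as the default implementation, but reads from differently named environment variables. The `MY_APP_` prefix and the helper name `credentials_from_env` are hypothetical, and the final `partial(sqlite_s3_query, ...)` call is left as a comment so that the block runs without the package installed.

```python
import os
from functools import partial


def credentials_from_env(prefix, environ=os.environ):
    """Return (region, access_key_id, secret_access_key, session_token).

    Hypothetical helper: reads variables named like <prefix>AWS_ACCESS_KEY_ID.
    """
    return (
        environ[prefix + 'AWS_DEFAULT_REGION'],
        environ[prefix + 'AWS_ACCESS_KEY_ID'],
        environ[prefix + 'AWS_SECRET_ACCESS_KEY'],
        environ.get(prefix + 'AWS_SESSION_TOKEN'),  # None unless temporary credentials are used
    )


# Bind the prefix so the result takes no arguments, like the default lambda
get_my_app_credentials = partial(credentials_from_env, 'MY_APP_')

# Assumed usage, mirroring the examples in this README:
# query_my_db = partial(sqlite_s3_query,
#     url='https://my-bucket.s3.eu-west-2.amazonaws.com/my-db.sqlite',
#     get_credentials=get_my_app_credentials,
# )
```

Because the environment is read each time the callable is invoked, rather than once at import time, rotated credentials would be picked up by later calls.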
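The overview describes each query as a series of HTTP range requests that all target one version of the S3 object. The sketch below shows what one such request could look like at the HTTP level: a `Range` header for the bytes wanted, and a `versionId` query parameter pinning the object version. The function name `range_request_parts` and the example version ID are hypothetical, and this is not the library's internal code; it only builds the request pieces and sends nothing.

```python
from urllib.parse import urlencode


def range_request_parts(url, version_id, offset, amount):
    """Build the URL and headers to fetch `amount` bytes starting at `offset`."""
    # HTTP byte ranges are inclusive at both ends
    headers = {'Range': f'bytes={offset}-{offset + amount - 1}'}
    # S3 GetObject accepts versionId as a query parameter
    versioned_url = url + '?' + urlencode({'versionId': version_id})
    return versioned_url, headers


url, headers = range_request_parts(
    'https://my-bucket.s3.eu-west-2.amazonaws.com/my-db.sqlite',
    version_id='example-version-id',  # hypothetical value
    offset=0, amount=100,  # the SQLite database header is the first 100 bytes of the file
)
print(url)
print(headers)
```

A real request to a private bucket would also need to be signed, which is where the region and credentials come in.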