https://github.com/blobcity/autoai

blobcity / autoai: Python based framework for Automatic AI for Regression and Classification over numerical data. Performs model search, hyper-parameter tuning, and high-quality Jupyter Notebook code generation. Apache-2.0 License.
# BlobCity AutoAI

A framework to find the best-performing AI/ML model for a given problem. It works for Classification and Regression problems on numerical data. AutoAI makes AI easy and accessible to everyone. It not only trains the best-performing model but also exports high-quality code for using the trained model.

The framework is currently in beta release, and active development is still in progress. Please report any issues you encounter.

## Getting Started

```shell
pip install blobcity
```

```python
import blobcity as bc
model = bc.train(file="data.csv", target="Y_column")
model.spill("my_code.py")
```

`Y_column` is the name of the target column. The column must be present within the data provided.

The framework automatically infers whether the problem is Regression or Classification.
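To give a feel for what automatic inference of the problem type can involve, here is a minimal sketch of one possible heuristic. It is not AutoAI's actual rule, which the README does not describe. The function name `infer_problem_type` and the `max_classes` threshold are hypothetical and exist only for this illustration.

```python
from typing import Sequence


def infer_problem_type(target: Sequence, max_classes: int = 20) -> str:
    """Guess the problem type from target values.

    Hypothetical heuristic for illustration only; not AutoAI's logic.
    """
    values = [v for v in target if v is not None]

    # Booleans and non-numeric labels (e.g. strings) can only be classes.
    if any(isinstance(v, bool) or not isinstance(v, (int, float)) for v in values):
        return "classification"

    # Any fractional (or non-finite) value suggests a continuous target.
    if any(not float(v).is_integer() for v in values):
        return "regression"

    # Whole numbers: a small set of distinct values looks like class labels.
    return "classification" if len(set(values)) <= max_classes else "regression"


print(infer_problem_type([0, 1, 1, 0]))       # whole-number labels, few distinct values
print(infer_problem_type([4.75, 9.25, 3.1]))  # fractional values
```

A threshold such as `max_classes` is a judgment call: a target made of whole numbers with many distinct values (for example, a price in whole units) is treated as continuous here.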
Data input formats supported include:

1. Local CSV / XLSX file
2. URL to a CSV / XLSX file
3. Pandas DataFrame

```python
model = bc.train(file="data.csv", target="Y_column") #local file
model = bc.train(file="https://example.com/data.csv", target="Y_column") #url
model = bc.train(df=my_df, target="Y_column") #DataFrame
```

## Pre-processing

The framework has built-in support for several data pre-processing techniques, such as imputing missing values, column encoding, and data scaling. Pre-processing is carried out automatically on the training data, and the predict function applies the same pre-processing to new data. You do not need to make any pre-processing choices yourself. To see which pre-processing methods were used on the data, export the entire model configuration to a YAML file. See the section on "Exporting to YAML."

## Feature Selection

```python
model.features() #prints the features selected by the model
```

```
['Present_Price', 'Vehicle_Age', 'Fuel_Type_CNG', 'Fuel_Type_Diesel', 'Fuel_Type_Petrol', 'Seller_Type_Dealer', 'Seller_Type_Individual', 'Transmission_Automatic', 'Transmission_Manual']
```

AutoAI automatically performs feature selection on the input data. All features (except the target) are potential candidates for the X input. AutoAI automatically removes ID / primary-key columns and ignores features that have low importance to the output. You can view the feature importance plot:

```python
model.plot_feature_importance() #shows a feature importance graph
```

[Figure: Feature Importance Plot]

There might be scenarios where you want to explicitly exclude some columns, or use only a subset of columns in the training. In that case, manually specify the features to be used. Specifying features does not guarantee that all of them will be used in the final model, because the framework still performs automated feature selection from amongst them. It only guarantees that other features, if present in the data, will not be considered.
AutoAI will still perform feature selection within the list of features provided, to improve model accuracy.

```python
model = bc.train(file="data.csv", target="Y_value", features=["col1", "col2", "col3"])
```

## Model Search, Train & Hyper-parameter Tuning

Model search, training, and hyper-parameter tuning are fully automatic. It is a 3-step process that tests your data across various AI/ML models. It finds the models that show a high tendency to succeed, then performs hyper-parameter tuning on them to find the best possible result.

- Regression Models Library
- Classification Models Library

## Code Generation

High-quality code generation is why most Data Scientists choose AutoAI. The `spill` function generates the model code with exhaustive documentation. scikit-learn models are exported with training code included. TensorFlow and other DNN models produce only the test / final use code.

[Figure: AutoAI Generated Code Example]

Code generation is supported in ipynb and py file formats, with options to enable or disable detailed documentation exports.

```python
model.spill("my_code.ipynb") #produces Jupyter Notebook file with full markdown docs
model.spill("my_code.py") #produces python code with minimal docs
model.spill("my_code.py", docs=True) #python code with full docs
model.spill("my_code.ipynb", docs=False) #Notebook file with minimal markdown
```

## Predictions

Use a trained model to generate predictions on new data.

```python
prediction = model.predict(file="unseen_data.csv")
```

All required features must be present in the `unseen_data.csv` file. Check the results of the automatic feature selection to get the list of features needed by the predict function.

## Stats & Accuracy

```python
model.plot_prediction()
```

The function is shared across Regression and Classification problems. It plots a chart suited to the problem type so you can assess how well training went.

### Actual v/s Predicted Plot (for Regression)

[Figure: Actual v/s Predicted Plot]

To plot only the first 100 rows, pass 100. Pass -100 to plot the last 100 rows.
```python
model.plot_prediction(100)
```

[Figure: Actual v/s Predicted Plot, first 100 rows]

### Confusion Matrix (for Classification)

```python
model.plot_prediction()
```

[Figure: AutoAI Generated Code Example]

### Numerical Stats

```python
model.stats()
```

Prints the key model metrics, such as Precision, Recall, and F1-Score. The metrics reported depend on the type of AutoAI problem.

## Persistence

```python
model.save('./my_model.pkl')
model = bc.load('./my_model.pkl')
```

You can save a trained model and load it in the future to generate predictions.

## Accelerated Training

Leverage BlobCity AI Cloud for fast training on large datasets. Reasonable cloud infrastructure is included for free.

BlobCity AI Cloud: CPU, GPU

## Features and Roadmap

- [x] Numerical data Classification and Regression
- [x] Automatic feature selection
- [x] Code generation
- [ ] Neural Networks & Deep Learning
- [ ] Image classification
- [ ] Optical Character Recognition (English only)
- [ ] Video tagging with YOLO
- [ ] Generative AI using GAN