https://github.com/blobcity/autoai


blobcity / autoai Public

Python based framework for Automatic AI for Regression and
Classification over numerical data. Performs model search,
hyper-parameter tuning, and high-quality Jupyter Notebook code
generation.

Apache-2.0 License


 BlobCity AutoAI

A framework to find the best performing AI/ML model for any AI
problem. Works for Classification and Regression type of problems on
numerical data. AutoAI makes AI easy and accessible to everyone. It
not only trains the best-performing model but also exports
high-quality code for using the trained model.

The framework is currently in beta and under active development.
Please report any issues you encounter.

 Getting Started

pip install blobcity

import blobcity as bc
model = bc.train(file="data.csv", target="Y_column")
model.spill("my_code.py")

Y_column is the name of the target column. The column must be present
within the data provided.

Automatic inference of Regression / Classification is supported by
the framework.
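
How the framework infers the problem type is not described here. One plausible heuristic, shown purely as an illustration, treats a target with non-numeric labels or only a few distinct whole-number values as Classification and everything else as Regression. The name `infer_problem_type` and the `max_classes` threshold are assumptions and are not part of the blobcity API.

```python
def infer_problem_type(target_values, max_classes=10):
    """Guess 'classification' or 'regression' from a target column (heuristic sketch)."""
    numeric = []
    for value in target_values:
        try:
            numeric.append(float(value))
        except (TypeError, ValueError):
            # Any non-numeric label (e.g. "spam") implies discrete classes.
            return "classification"
    # Whole-number targets with few distinct values look like class labels.
    all_whole = all(x.is_integer() for x in numeric)
    if all_whole and len(set(numeric)) <= max_classes:
        return "classification"
    return "regression"
```

With this sketch, a target of `["yes", "no"]` or `[0, 1, 1, 0]` is treated as Classification, while `[1.5, 2.7, 3.1]` is treated as Regression.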

Data input formats supported include:

 1. Local CSV / XLSX file
 2. URL to a CSV / XLSX file
 3. Pandas DataFrame

model = bc.train(file="data.csv", target="Y_column") #local file

model = bc.train(file="https://example.com/data.csv", target="Y_column") #url

model = bc.train(df=my_df, target="Y_column") #DataFrame

 Pre-processing

The framework has built-in support for several data pre-processing
techniques, such as imputing missing values, column encoding, and
data scaling.

Pre-processing is carried out automatically on train data. The
predict function carries out the same pre-processing on new data. The
user is not required to be concerned with the pre-processing choices
of the framework.

One can view the pre-processing methods used on the data by exporting
the entire model configuration to a YAML file. Check the section
below on "Exporting to YAML."
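
To make the three techniques named above concrete, here is a minimal standard-library sketch of mean imputation, one-hot column encoding, and min-max scaling. These are textbook versions chosen for illustration. Which methods the framework actually applies is not stated here, and the helper names are hypothetical.

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def one_hot(column_name, values):
    """Expand a categorical column into 0/1 indicator columns."""
    categories = sorted(set(values))
    return {
        f"{column_name}_{category}": [1 if v == category else 0 for v in values]
        for category in categories
    }

def min_max_scale(values):
    """Rescale numeric values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

prices = impute_mean([4.0, None, 8.0])                       # missing value filled
fuel = one_hot("Fuel_Type", ["Petrol", "Diesel", "Petrol"])  # indicator columns
scaled = min_max_scale(prices)                               # values in [0, 1]
```

The encoded column names produced here (`Fuel_Type_Diesel`, `Fuel_Type_Petrol`) follow the same `column_value` pattern seen in the feature list in the next section.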

 Feature Selection

model.features() #prints the features selected by the model

['Present_Price',
 'Vehicle_Age',
 'Fuel_Type_CNG',
 'Fuel_Type_Diesel',
 'Fuel_Type_Petrol',
 'Seller_Type_Dealer',
 'Seller_Type_Individual',
 'Transmission_Automatic',
 'Transmission_Manual']

AutoAI automatically performs a feature selection on input data. All
features (except target) are potential candidates for the X input.

AutoAI will automatically remove ID / Primary-key columns.
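
The rule AutoAI uses to recognise ID / primary-key columns is not given here. As a rough illustration only, the sketch below flags a column when every value is a distinct whole number. This is a deliberately crude heuristic (it would also flag a genuine feature whose values all happen to be unique integers), and both function names are hypothetical.

```python
def looks_like_id_column(values):
    """Heuristic: every value is a distinct whole number, like a row identifier."""
    values = list(values)
    if len(values) < 2:
        return False
    try:
        as_numbers = [float(v) for v in values]
    except (TypeError, ValueError):
        return False
    all_whole = all(n.is_integer() for n in as_numbers)
    return all_whole and len(set(as_numbers)) == len(values)

def drop_id_columns(table):
    """Return a copy of a {column: values} table without ID-like columns."""
    return {name: col for name, col in table.items() if not looks_like_id_column(col)}

table = {
    "car_id": [101, 102, 103, 104],
    "Vehicle_Age": [3, 5, 3, 7],
    "Present_Price": [5.59, 9.54, 9.85, 4.15],
}
kept = drop_id_columns(table)
```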

When features are specified manually, there is no guarantee that
every specified feature will be used in the final model. The
framework still performs automated feature selection from amongst
them. The only guarantee is that other features present in the data
will not be considered.

AutoAI ignores features of low importance to the output. Feature
importance can be viewed as a plot.

model.plot_feature_importance() #shows a feature importance graph

Feature Importance Plot

There might be scenarios where you want to explicitly exclude some
columns, or use only a subset of columns in training. In such cases,
manually specify the features to be used. AutoAI will still perform
feature selection within the list provided to improve model accuracy.

model = bc.train(file="data.csv", target="Y_value", features=["col1", "col2", "col3"])
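
The candidate rules described in this section can be summarised in a few lines: without a `features` list every column except the target is a candidate, and with one only the listed columns are considered. The sketch below is an illustration of those rules, not the framework's implementation. `candidate_features` is a hypothetical helper, and raising on unknown column names is an assumption.

```python
def candidate_features(columns, target, features=None):
    """List the columns eligible for automatic feature selection."""
    if target not in columns:
        raise ValueError(f"target column {target!r} not found in data")
    if features is None:
        # Default: every column except the target is a candidate.
        return [c for c in columns if c != target]
    unknown = [f for f in features if f not in columns]
    if unknown:
        raise ValueError(f"features not found in data: {unknown}")
    # Explicit list: only these columns are considered, never the target.
    return [f for f in features if f != target]

columns = ["col1", "col2", "col3", "col4", "Y_value"]
default_candidates = candidate_features(columns, "Y_value")
restricted = candidate_features(columns, "Y_value", features=["col1", "col2", "col3"])
```

Automatic feature selection would then run over the returned list, so the final model may use fewer columns than are listed.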

 Model Search, Train & Hyper-parameter Tuning

Model search, training, and hyper-parameter tuning are fully
automatic. This is a three-step process that tests your data across
various AI/ML models, identifies the models most likely to perform
well, and then tunes their hyper-parameters to find the best possible
result.
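
The general shape of such a search can be sketched with toy models and the standard library alone. The sketch assumes the three steps are a first-pass evaluation, a shortlist, and a grid search over the shortlisted models. The candidate models, grids, and function names below are illustrative stand-ins, not the models or procedure AutoAI uses.

```python
from statistics import mean

def fit_mean(xs, ys, _param):
    m = mean(ys)
    return lambda x: m  # always predicts the training mean

def fit_knn(xs, ys, k):
    def predict(x):
        nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x))[:k]
        return mean(y for _, y in nearest)
    return predict

def fit_ridge(xs, ys, alpha):
    # One-dimensional linear fit with an L2 penalty on the slope.
    xbar, ybar = mean(xs), mean(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)
    return lambda x: ybar + slope * (x - xbar)

# name -> (fit function, hyper-parameter grid); the first grid entry is the default
CANDIDATES = {
    "mean": (fit_mean, [None]),
    "knn": (fit_knn, [1, 3, 5]),
    "ridge": (fit_ridge, [0.0, 1.0, 10.0]),
}

def mse(model, xs, ys):
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))

def search(train, val, shortlist_size=2):
    (xt, yt), (xv, yv) = train, val
    # Step 1: score every candidate once with its default hyper-parameter.
    first_pass = {
        name: mse(fit(xt, yt, grid[0]), xv, yv)
        for name, (fit, grid) in CANDIDATES.items()
    }
    # Step 2: keep only the most promising candidates.
    shortlist = sorted(first_pass, key=first_pass.get)[:shortlist_size]
    # Step 3: grid-search the hyper-parameters of the shortlisted candidates.
    best = None
    for name in shortlist:
        fit, grid = CANDIDATES[name]
        for param in grid:
            score = mse(fit(xt, yt, param), xv, yv)
            if best is None or score < best[2]:
                best = (name, param, score)
    return best

xs = list(range(20))
ys = [2 * x + 1 for x in xs]
train = (xs[0::2], ys[0::2])  # even rows for training
val = (xs[1::2], ys[1::2])    # odd rows for validation
best = search(train, val)     # (model name, hyper-parameter, validation MSE)
```

On this perfectly linear toy data the unpenalised linear fit wins. Shortlisting before tuning keeps the expensive grid search away from candidates that already perform poorly.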

Regression Models Library

Classification Models Library

 Code Generation

High-quality code generation is why most Data Scientists choose
AutoAI. The spill function generates the model code with exhaustive
documentation. Scikit-learn models are exported with training code
included. TensorFlow and other DNN models export only the test /
final-use code.

AutoAI Generated Code Example

Code generation is supported in ipynb and py file formats, with
options to enable or disable detailed documentation exports.

model.spill("my_code.ipynb") #produces Jupyter Notebook file with full markdown docs

model.spill("my_code.py") #produces python code with minimal docs

model.spill("my_code.py", docs=True) #python code with full docs

model.spill("my_code.ipynb", docs=False) #Notebook file with minimal markdown
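
Going by the comments on the four calls above, the default documentation level depends on the file extension: full docs for `.ipynb`, minimal docs for `.py`, with `docs=` overriding either. The sketch below only restates that behaviour as a small function. `resolve_spill_options` is hypothetical, and rejecting other extensions is an assumption rather than documented behaviour.

```python
from pathlib import Path

def resolve_spill_options(filename, docs=None):
    """Return (format, docs) as implied by the filename and the optional docs flag."""
    suffix = Path(filename).suffix.lower()
    if suffix not in {".py", ".ipynb"}:
        raise ValueError(f"unsupported export format: {suffix or filename!r}")
    if docs is None:
        # Assumed defaults: notebooks get full docs, scripts get minimal docs.
        docs = suffix == ".ipynb"
    return suffix.lstrip("."), docs
```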

 Predictions

Use a trained model to generate predictions on new data.

prediction = model.predict(file="unseen_data.csv")

All required features must be present in the unseen_data.csv file.
Consider checking the results of the automatic feature selection to
know the list of features needed by the predict function.
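
A quick header check before calling predict can catch missing columns early. The sketch below uses only the standard library. `missing_columns` is a hypothetical helper, and it assumes the required list holds raw column names as they appear in the file (the names reported by `model.features()` may instead be encoded names such as `Fuel_Type_Petrol`).

```python
import csv
import io

def missing_columns(handle, required):
    """Return the required column names absent from a CSV file's header row."""
    header = next(csv.reader(handle), [])
    present = set(header)
    return [name for name in required if name not in present]

# An in-memory stand-in for unseen_data.csv
unseen = io.StringIO("Present_Price,Fuel_Type\n5.59,Petrol\n")
missing = missing_columns(unseen, ["Present_Price", "Vehicle_Age", "Fuel_Type"])
```

For a file on disk, pass `open("unseen_data.csv", newline="")` in place of the `StringIO` object.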

 Stats & Accuracy

model.plot_prediction()

The function is shared across Regression and Classification problems.
It plots a chart relevant to the problem type for assessing the
quality of the trained model.

 Actual v/s Predicted Plot (for Regression)

Actual v/s Predicted Plot

To plot only the first 100 rows, pass 100. Pass -100 to plot the last
100 rows.

model.plot_prediction(100)
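
The sign convention maps directly onto Python slicing. The sketch below shows one way the row selection could work. `select_rows` is a hypothetical helper, and treating a missing or zero count as "all rows" is an assumption.

```python
def select_rows(rows, n=None):
    """Positive n keeps the first n rows, negative n keeps the last |n| rows."""
    rows = list(rows)
    if not n:
        # None or 0: keep everything (an assumption, not documented behaviour).
        return rows
    return rows[:n] if n > 0 else rows[n:]
```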

Actual v/s Predicted Plot first 100

 Confusion Matrix (for Classification)

model.plot_prediction()

AutoAI Generated Code Example
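
For reference, the counts behind a confusion matrix can be computed in a few lines. This is a generic standard-library sketch of the concept, with rows for actual labels and columns for predicted labels. It is not the plotting code used by the framework, and the sample labels are made up.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Return (labels, matrix) where matrix[i][j] counts actual i predicted as j."""
    labels = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    matrix = [[counts[(a, p)] for p in labels] for a in labels]
    return labels, matrix

labels, matrix = confusion_matrix(
    ["cat", "cat", "dog", "dog", "dog"],  # actual
    ["cat", "dog", "dog", "dog", "cat"],  # predicted
)
```

Correct predictions sit on the diagonal, so a well-trained classifier yields a matrix with large diagonal entries and small off-diagonal ones.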

 Numerical Stats

model.stats()

Prints the key model metrics, such as Precision, Recall, and
F1-Score. The metrics reported depend on the type of AutoAI problem.
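
As a reminder of what the classification metrics mean, the sketch below computes Precision, Recall, and F1-Score for a binary problem from first principles. It is a generic illustration and does not reflect the output format of `model.stats()`. The function name and sample labels are made up.

```python
def binary_classification_stats(actual, predicted, positive=1):
    """Compute precision, recall, and F1 for one positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

stats = binary_classification_stats(
    [1, 1, 1, 0, 0, 0],  # actual
    [1, 1, 0, 1, 0, 0],  # predicted
)
```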

 Persistence

model.save('./my_model.pkl')

model = bc.load('./my_model.pkl')

You can save a trained model, and load it in the future to generate
predictions.
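
The `.pkl` extension suggests pickle-style serialisation, although whether blobcity uses Python's `pickle` module internally is an assumption. The round trip below illustrates the general save-and-load pattern with a plain dictionary standing in for a trained model. The helper names are hypothetical.

```python
import os
import pickle
import tempfile

def save_object(obj, path):
    """Serialise an object to disk."""
    with open(path, "wb") as handle:
        pickle.dump(obj, handle)

def load_object(path):
    """Load a previously saved object. Only unpickle files you trust."""
    with open(path, "rb") as handle:
        return pickle.load(handle)

# A plain dict stands in for a trained model in this sketch.
stand_in_model = {"kind": "linear", "coefficients": [0.5, -1.25], "intercept": 2.0}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "my_model.pkl")
    save_object(stand_in_model, path)
    restored = load_object(path)
```

Unpickling can execute arbitrary code, so saved model files should only be loaded from trusted sources.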

 Accelerated Training

Leverage BlobCity AI Cloud for fast training on large datasets. A
reasonable amount of cloud infrastructure is included for free.


 Features and Roadmap

  * [*] Numerical data Classification and Regression
  * [*] Automatic feature selection
  * [*] Code generation
  * [ ] Neural Networks & Deep Learning
  * [ ] Image classification
  * [ ] Optical Character Recognition (English only)
  * [ ] Video tagging with YOLO
  * [ ] Generative AI using GAN


Contributors 20

  * @Thilakraj1998
  * @sanketsarang
  * @naresh1205
  * @aadityasinha-dotcom
  * @SaharshLaud
  * @bhumikaBisht
  * @Devolta05
  * @26tanishabanik
  * @sreyan-ghosh
  * @melan96
  * @Tanuj2552

+ 9 contributors
