
Simplifying machine learning

Project description

1. ABOUT PROJECT


  • mllibs is a Machine Learning (ML) library which utilises natural language processing (NLP)
  • The development of such helper modules is motivated by the fact that everyone's understanding of coding and of the subject matter (ML in this case) is different
  • Often we see people create functions and classes to simplify the process of achieving something (which is good practice)
  • NLP interpreters follow the same trend, except that the only input needed to activate a piece of code is natural language
  • Using Python, we can interpret natural language, supplied as string data, with natural language interpreters

2. CODE AUTOMATION


I'm sure most people are familiar with code automation:

(1) Let's take a look at how we would simplify our code using functions

def fib_list(n):
    result = []
    a,b = 0,1
    while a<n:
        result.append(a)
        a,b = b, a + b
    return result

fib_list(5) # [0, 1, 1, 2, 3]

(2) Let's take a look at how we would simplify our code using a class structure:

class fib_list:
    
    def __init__(self,n):
        self.n = n

    def get_list(self):
        result = []
        a,b = 0,1
        while a<self.n:
            result.append(a)
            a,b = b, a + b
        return result

fib = fib_list(5)
fib.get_list() # [0, 1, 1, 2, 3]

Both approaches presume that we have coding knowledge. Our next approach doesn't require it

(3) Let's take a look at how we could simplify this using language

request = 'calculate the fibonacci sequence for the value of 5'  # named request so the built-in input() is not shadowed
nlp_interpreter(request) # [0, 1, 1, 2, 3]
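
To make the idea a little more concrete, here is a minimal sketch of what such an interpreter could look like for this one request. It is only an illustration: toy_interpreter is a hypothetical name, it is not part of mllibs, and plain keyword matching stands in for whatever models a real interpreter would use.

```python
import re

def fib_list(n):
    """Return the Fibonacci numbers that are smaller than n."""
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a + b
    return result

def toy_interpreter(request):
    """Hypothetical keyword-based interpreter (an assumption, not mllibs code)."""
    text = request.lower()
    # step 1: decide which task the request refers to
    if 'fibonacci' not in text:
        raise ValueError('request not understood: ' + request)
    # step 2: extract the parameter (the first integer in the request)
    match = re.search(r'\d+', text)
    if match is None:
        raise ValueError('no numeric value found in request')
    # step 3: activate the code associated with the task
    return fib_list(int(match.group()))

request = 'calculate the fibonacci sequence for the value of 5'
print(toy_interpreter(request))  # [0, 1, 1, 2, 3]
```

The user only supplies a sentence; choosing the function and pulling out its argument both happen inside the interpreter.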

3. LET'S LOOK AT AN EXAMPLE USING MLLIBS


Let's check out one example of how it works; let's visualise some data by requesting:

Our cell output will be:


4. WHY THIS LIBRARY EXISTS


A good question to ask ourselves is why would this be needed?

Here are some answers:

  • Not everyone's level of programming is the same; some people might struggle, whilst others know it quite well
  • The same goes for the topic of Machine Learning, where there are quite a few concepts to remember

The package aims to provide:

  • A user-friendly introduction to Machine Learning for someone who is new to the field and has little knowledge of programming

5. PROJECT STATUS


  • mllibs is usable, but still very raw; I'm constantly trying ways to improve and clean up the code structure
  • If you would like to try it out, you can simply fork the notebook nlp module for mllibs

6. LIBRARY COMPONENTS


mllibs consists of two parts:

(1) modules associated with the interpreter

  • nlpm - groups together everything required for the interpreter module nlpi
  • nlpi - main interpreter component module (requires nlpm instance)
  • snlpi - single request interpreter module (uses nlpi)
  • mnlpi - multiple request interpreter module (uses nlpi)
  • interface - interactive module (chat type)

(2) custom added modules; for mllibs, these modules are associated with machine learning

You can check all the activation functions using session.fl(), as shown in the sample notebooks


7. MODULE COMPONENT STRUCTURE


Currently new modules can be added using a custom class sample and a configuration dictionary configure_sample

# sample module class structure
class sample(nlpi):
    
    # called in nlpm
    def __init__(self,nlp_config):
        self.name = 'sample'             # unique module name identifier (used in nlpm/nlpi)
        self.nlp_config = nlp_config  # text based info related to module (used in nlpm/nlpi)
        
    # called in nlpi
    def sel(self,args:dict):
        
        self.select = args['pred_task']
        self.args = args
        
        if(self.select == 'function'):
            self.function(self.args)
        
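    # define each task as EITHER a standard method OR a static method, not both:
    # two definitions with the same name in one class mean the second replaces the first
        
    def function(self,args:dict):
        pass
        
    # static alternative (use it instead of the standard method above)
    # @staticmethod
    # def function(args:dict):
    #     pass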
    

from collections import OrderedDict

corpus_sample = OrderedDict({"function":['task']})
info_sample = {'function': {'module':'sample',
                            'action':'action',
                            'topic':'topic',
                            'subtopic':'sub topic',
                            'input_format':'input format for data',
                            'output_format':'output format for data',
                            'description':'write description'}}
                         
# configuration dictionary (passed in nlpm)
configure_sample = {'corpus':corpus_sample,'info':info_sample}
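
As an illustration, the template above could be filled in for the Fibonacci example from section 2. This is only a sketch under a few assumptions: nlpi_stub is a stand-in base class so the snippet runs without mllibs installed, the 'n' key in args is a hypothetical parameter name, and sel returns the result here (the template above does not).

```python
from collections import OrderedDict

class nlpi_stub:
    """Stand-in for the mllibs nlpi base class (assumption, keeps the sketch self-contained)."""
    pass

class fib_module(nlpi_stub):

    def __init__(self, nlp_config):
        self.name = 'fib_module'        # unique module name identifier
        self.nlp_config = nlp_config    # text based info related to the module

    def sel(self, args: dict):
        # dispatch on the task name predicted by the interpreter
        self.select = args['pred_task']
        self.args = args
        if self.select == 'fib_list':
            return self.fib_list(self.args)
        raise ValueError('unknown task: ' + str(self.select))

    @staticmethod
    def fib_list(args: dict):
        # 'n' is a hypothetical key used only in this sketch
        result = []
        a, b = 0, 1
        while a < args['n']:
            result.append(a)
            a, b = b, a + b
        return result

# example requests that should activate the task
corpus_fib = OrderedDict({'fib_list': ['calculate the fibonacci sequence',
                                       'list fibonacci numbers']})
info_fib = {'fib_list': {'module': 'fib_module',
                         'action': 'calculate',
                         'topic': 'sequences',
                         'subtopic': 'fibonacci',
                         'input_format': 'int',
                         'output_format': 'list',
                         'description': 'list the fibonacci numbers below n'}}

configure_fib = {'corpus': corpus_fib, 'info': info_fib}

module = fib_module(configure_fib)
print(module.sel({'pred_task': 'fib_list', 'n': 5}))  # [0, 1, 1, 2, 3]
```

The keys of the corpus and info dictionaries match the task name that sel checks for, which is what ties the text description of a task to the method that runs it.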

8. CREATING A COLLECTION

There are two ways to start an interpreter session: manually importing and grouping modules, or using the interface class


First, we need to combine all our module components into one collection, which links all the passed modules together

collection = nlpm()
collection.load([loader(configure_loader),
                 simple_eda(configure_eda),
                 encoder(configure_nlpencoder),
                 embedding(configure_nlpembed),
                 cleantext(configure_nlptxtclean),
                 sklinear(configure_sklinear),
                 hf_pipeline(configure_hfpipe),
                 eda_plot(configure_edaplt)])
                 

Then we need to train the interpreter models

collection.train()

Lastly, pass the collection of modules (nlpm instance) to the interpreter nlpi

session = nlpi(collection)

The nlpi class provides the exec method, which interprets user input

session.exec('create a scatterplot using data with x dimension1 y dimension2')

The faster way is to use the interface class, which loads all the modules and groups them together for us:

from mllibs.interface import interface
session = interface()
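
To give a feel for the load, train and interpret steps described in this section, here is a toy version written in plain Python. Everything in it is an assumption made for illustration: toy_collection and toy_session are hypothetical stand-ins for nlpm and nlpi, and simple word overlap (Jaccard similarity) replaces the trained models, whose details are not described here.

```python
def tokenize(text):
    """Split a request into a set of lowercase words."""
    return set(text.lower().split())

class toy_collection:
    """Hypothetical stand-in for nlpm: gathers the corpus of every module."""

    def __init__(self):
        self.corpus = {}   # task name -> list of example requests
        self.vocab = {}    # task name -> set of words (filled by train)

    def load(self, configs):
        for config in configs:
            self.corpus.update(config['corpus'])

    def train(self):
        # "training" here only collects the words seen for each task
        for task, examples in self.corpus.items():
            words = set()
            for example in examples:
                words |= tokenize(example)
            self.vocab[task] = words

class toy_session:
    """Hypothetical stand-in for nlpi: maps a request to a task name."""

    def __init__(self, collection):
        self.collection = collection

    def predict_task(self, request):
        words = tokenize(request)

        def jaccard(task):
            vocab = self.collection.vocab[task]
            return len(words & vocab) / len(words | vocab)

        # pick the task whose corpus overlaps most with the request
        return max(self.collection.vocab, key=jaccard)

configure_plot = {'corpus': {'scatterplot': ['create a scatterplot',
                                             'make a scatter plot of x and y']}}
configure_fib = {'corpus': {'fib_list': ['calculate the fibonacci sequence']}}

collection = toy_collection()
collection.load([configure_plot, configure_fib])
collection.train()

session = toy_session(collection)
print(session.predict_task('create a scatterplot using data with x dimension1 y dimension2'))  # scatterplot
```

The predicted task name plays the role of args['pred_task'] in the module template from section 7: once the task is known, the matching module method can be called.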


Built distribution: mllibs-0.1.2-py2.py3-none-any.whl (90.5 kB), uploaded for Python 2 and Python 3
