Machine Learning in Procurement
Intro
Spend analysis is necessary for reviewing procurement spend in order to decrease costs, increase efficiency, or improve supplier relationships. The ML model makes it possible to recognize a spend type based on its specification.
Features
The app includes the following features:
Demo
The script performs the following steps:
- Loading labeled sets of features into memory.
- Splitting data into training and testing subsets:
  - X_train - features for the model training,
  - y_train - corresponding labels for the model training,
  - X_test - features for the model testing,
  - y_test - corresponding labels for the model testing.
- Setting up a pipeline that vectorizes words and applies the LinearSVC algorithm:
  - We need to apply word vectorization when working with word features.
  - To do so, we put all the words from all the labeled samples into one bag, the bag of words.
  - Then we check each sample's word features against the bag of words.
  - While checking, we create a sub-list of the sample's word features whose length equals the number of words in the current sample.
  - The check counts how often each word occurs in the sample.
  - Words are replaced with numbers so that the machine can compute with them.
  - The more frequently a word appears, the higher the value it gets.
  - The outcome of vectorization is a list of numbers where each item represents a single word.
- Training the ML model using the subsets X_train and y_train.
- Using the trained model to predict labels for the test features from X_test.
  - We can assess the model's accuracy by comparing the predictions with the respective test labels from y_test.
  - Accuracy can be measured as a ratio: the number of predictions that match y_test divided by the number of labels in y_test.
  - When it is higher than 80%, the model is reliable enough for my needs.
- Once I find the model reliable, I can feed it new features that it has not seen.
  - The trained model gives labels for the new features.
Input:
- Here is the structure of the input data.
- The first column is the column of labels.
- The second column is the column of features.
- Each feature has a label assigned.
- Spend specifications are used as features for training the ML model.
- Team clusters (e.g. Information Technologies, Logistic) are used as labels for training the ML model.
- The model's accuracy equals 80%, which is enough for our purpose.
- 80% accuracy means that 80% of the labels predicted for the testing features match the testing labels.
- Application: we can use the ML model to recognize the team's cluster from the spend specification assigned to a spend.
- This recognition enables us to distribute the workload among particular teams.
- Here we can see both the overall accuracy and the accuracy for a particular team.
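A small sketch of how the overall accuracy and the accuracy for a particular team could be computed. The test labels and predictions below are hypothetical, and "accuracy for a team" is assumed here to mean the share of that team's test samples that were labeled correctly (what scikit-learn's `classification_report` calls recall).

```python
from collections import Counter

# Hypothetical test labels and predictions, for illustration only.
y_test = [
    "Information Technologies",
    "Information Technologies",
    "Information Technologies",
    "Logistic",
    "Logistic",
]
predictions = [
    "Information Technologies",
    "Logistic",
    "Information Technologies",
    "Logistic",
    "Logistic",
]


def overall_accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def accuracy_per_label(y_true, y_pred):
    """For each true label, the share of its samples that were predicted correctly."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / totals[label] for label in totals}


overall = overall_accuracy(y_test, predictions)
per_label = accuracy_per_label(y_test, predictions)

print(overall)  # 0.8
for label, value in per_label.items():
    print(f"{label}: {value:.0%}")
```

Here four of the five predictions match, so the overall accuracy is 80%, while the per-team values differ because one Information Technologies sample was assigned to Logistic.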
Output:
- Above we can see the final output, where the ML model assigned labels to new features that had never been introduced to the model.
- New features:
  - Box file storage
  - Data archiving on server
  - Building engineering construction
  - Automation business process
  - Computer standing workstation
- The script saves the output in a separate file in the current project directory.
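The labeling and saving step might look like the sketch below. The training samples are made up, the file name `predictions.csv` and the column names `label` and `feature` are hypothetical, and the labels produced by such a tiny model are not meaningful; only the flow from new features to a saved file is illustrated.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Made-up training samples, for illustration only.
train_features = [
    "server data storage",
    "computer monitor replacement",
    "software license renewal",
    "data backup software",
    "freight transport service",
    "warehouse pallet storage",
    "parcel delivery service",
    "truck freight transport",
]
train_labels = ["Information Technologies"] * 4 + ["Logistic"] * 4

model = make_pipeline(CountVectorizer(), LinearSVC())
model.fit(train_features, train_labels)

# The new spend specifications from the list above; the model has not seen them.
new_features = [
    "Box file storage",
    "Data archiving on server",
    "Building engineering construction",
    "Automation business process",
    "Computer standing workstation",
]

# Same layout as the input: labels in the first column, features in the second.
output = pd.DataFrame({
    "label": model.predict(new_features),
    "feature": new_features,
})

# Hypothetical file name; the file is written to the current working directory.
output.to_csv("predictions.csv", index=False)
print(output)
```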
Setup
The following Python libraries need to be installed.
pip install scikit-learn
pip install pandas