ML to Predict Formulas:
Still in progress
Four labels:
- more: Structured: Predict equation from text formula. (important)
- less: Noise: Leave Nonsense. (won't equate)
- same: Label well trained: Existing equation. (stronger)
- unknown: Unstructured: Not sure (unstructured)
Features are words or formulas. Only add a feature once you know how it relates to the other features.
- Good
- Bad
- Amount
- Frequency
- Distance
- Direction
- Time
- ...
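One way to read the rule above: a new feature can only join the table once every existing training row has a value for it. The following is a minimal sketch of that check, not a fixed part of the method. The helper `add_feature`, the two sample rows, and the 'Frequency' values are all hypothetical and chosen only for illustration.

```python
import numpy as np

def add_feature(feature_name, feature_value, name, column):
    """Append one feature column; `column` needs one value per existing row."""
    column = np.asarray(column, dtype=float)
    if column.shape != (feature_value.shape[0],):
        raise ValueError("need exactly one value per existing training row")
    new_names = np.append(feature_name, name)
    new_values = np.column_stack([feature_value, column])
    return new_names, new_values

# Two made-up training rows with the three starting features
names = np.array(['Good', 'Bad', 'Amount'])
values = np.array([[1.0, -1.0, 1.0],
                   [2.0, 0.0, 3.0]])

# Hypothetical 'Frequency' values, one per existing row
names, values = add_feature(names, values, 'Frequency', [5.0, 0.5])
print(names)
print(values.shape)
```

Passing a column with the wrong number of values raises a `ValueError`, so a feature that cannot yet be related to every existing row is rejected instead of being added silently.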
The first training formulas are the obvious cases:
import numpy as np
dataset = {}
dataset['target_name'] = np.array(['More', 'Less', 'Same', 'Unknown'])
# Targets are identified by their positions: 0, 1, 2, 3
# Currently 3 features; add more once a feature's value can be related to the existing ones
dataset['feature_name'] = np.array(['Good', 'Bad', 'Amount'])
dataset['feature_value'] = np.array([
[9999**9, -9999**9, 9999**9], # target more, thus 0
[-9999**9, 9999**9, 0], # target less, thus 1
[9999**9, 9999**9, 9999**9], # target same, thus 2
[1, -1, 1], # target unknown, thus 3
], dtype=float) # store as floats: 9999**9 does not fit in a 64-bit integer
# Each row of feature_value above maps to the target at the same position
dataset['target'] = np.array([0, 1, 2, 3])
################
# Now let's train a model
from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(dataset['feature_value'], dataset['target'])
###############
# Now let's predict
# 10 Good, 0 Bad, 1 Amount
what_is = np.array([[10, 0, 1]])
prediction = knn.predict(what_is)
print(dataset['target_name'][prediction])
# predict_proba is a method of the fitted model, not of the prediction array
print(knn.predict_proba(what_is))
# Answer is: Unknown
# Certainty: [[0. 0. 0. 1.]]
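
The 'Unknown' result follows from the scale of the training rows. With a single neighbor, the classifier returns the label of the closest row, and any small query sits far closer to [1, -1, 1] than to the rows built from 9999**9. The sketch below redoes that nearest-row lookup by hand with NumPy only, assuming Euclidean distance. The names `BIG`, `distances` and `nearest` are introduced here for illustration.

```python
import numpy as np

# Same training rows as in the script above, stored as floats
BIG = float(9999**9)
feature_value = np.array([
    [BIG, -BIG, BIG],   # More
    [-BIG, BIG, 0.0],   # Less
    [BIG, BIG, BIG],    # Same
    [1.0, -1.0, 1.0],   # Unknown
])
target_name = np.array(['More', 'Less', 'Same', 'Unknown'])

# 10 Good, 0 Bad, 1 Amount
what_is = np.array([10.0, 0.0, 1.0])

# Euclidean distance from the query to every training row
distances = np.linalg.norm(feature_value - what_is, axis=1)

# Index of the closest row; its label is the 1-nearest-neighbor prediction
nearest = int(np.argmin(distances))
print(distances)
print(target_name[nearest])
```

Because the first three rows are so large, every realistic query lands on 'Unknown' until training rows with comparable magnitudes are added.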