AI Prompt Learning Goal: To demonstrate how prompt-driven discovery can be used to accelerate learning and objective prioritization.
Our goal is to help you learn more, faster.
AI Prompt Learning 01 Specific Goal: To focus on data acquisition, normalization, and ML Model prioritization for the "employee time sheet AI solution" example.
Start with Strategy
First, as with anything, you need to understand what you're trying to achieve. Start by defining an outcome that your Prompt Learning is going to help you understand better, such as:
"I want to use Generative AI to accelerate and simplify the way that my employees fill out their time sheets."
Then possibly ask yourself...
"Why is this a valuable thing to use Generative AI?"
You may arrive at something like...
"Well something like Open AI's ChatGPT is a really fast and simple workflow, it's low cost and high value."
But how valuable is the generated output if you have nothing to do with it, or no real strategy for following up your prompts with more information to increase the fidelity of ChatGPT's outputs and the knowledge you're gaining?
"Alright, for now I want to simply understand what the big computer hive mind has to say about automating something so obnoxiously sequential and mundane like filling out time sheets."
Ensure that you use a high-quality, simple prompt that represents the big picture of what you're trying to achieve.
Prompt 01: How can I use Generative AI to automate the time sheet workflow I have for my employees?
Data Collection: Collect the historical data of your employees' time sheets, including employee name, date, and hours worked. This data should be in a structured format so it can be analyzed by the AI model.
Preprocessing: This involves cleaning and formatting the data into a suitable format that can be processed by the AI model. This includes removing outliers and filling missing values.
Model Selection: Select a generative AI model that can generate time sheets based on the historical data. This can be achieved through Natural Language Processing (NLP) models such as OpenAI's GPT-3 or machine learning models such as Deep Learning or Recurrent Neural Networks.
Training: Train the selected model with the preprocessed data. The model should be trained to learn the patterns and relationships in the data to generate accurate time sheets.
Integration: Integrate the trained AI model with the existing time sheet workflow to automate the process. This can be achieved through an API or web service that can receive input data and generate time sheets.
Testing and Validation: Test the model with new data and validate its accuracy. Ensure that the generated time sheets are consistent with the historical data and the time sheet requirements.
Deployment: Once the model is tested and validated, deploy it in production to automate the time sheet workflow for your employees.
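To make the Data Collection and Preprocessing steps above a little more concrete, here's a minimal pandas sketch. The file name and column names (employee_name, date, hours_worked) are assumptions for illustration, not a required schema.

```python
import pandas as pd

# Load the historical time sheet export (file and column names are assumptions)
timesheets = pd.read_csv("historical_timesheets.csv")

# Parse dates and coerce hours to numbers; bad values become NaT/NaN
timesheets["date"] = pd.to_datetime(timesheets["date"], errors="coerce")
timesheets["hours_worked"] = pd.to_numeric(timesheets["hours_worked"], errors="coerce")

# Fill missing hours with each employee's median as a simple imputation choice
timesheets["hours_worked"] = timesheets.groupby("employee_name")["hours_worked"].transform(
    lambda hours: hours.fillna(hours.median())
)

# Remove obvious outliers: negative hours or more than 24 hours in a day
timesheets = timesheets[
    (timesheets["hours_worked"] >= 0) & (timesheets["hours_worked"] <= 24)
]

print(timesheets.head())
```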
To automate the time sheet process with Generative AI, historical employee time sheet data needs to be collected, cleaned, and formatted for analysis by an AI model. A suitable generative AI model, such as a Natural Language Processing or machine learning model, should be selected and trained on the preprocessed data to generate accurate time sheets. The trained model is then integrated with the existing time sheet workflow and tested for accuracy and consistency with the historical data. This is really fun, right?
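As a rough sketch of what the Integration step could look like, a thin web service can sit between the existing workflow and whatever model you land on. Everything here is hypothetical (the draft_timesheet stand-in, the endpoint name, the payload shape); it just shows the shape of an API that receives input data and returns a generated time sheet.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def draft_timesheet(employee_name, week_start):
    # Stand-in for the trained model; in practice this would call your
    # generative model or prediction pipeline to propose hours per day.
    return {
        "employee": employee_name,
        "week_start": week_start,
        "entries": [{"weekday": day, "hours": 8.0} for day in range(1, 6)],
    }

@app.route("/timesheet/draft", methods=["POST"])
def create_draft():
    payload = request.get_json()
    draft = draft_timesheet(payload["employee_name"], payload["week_start"])
    return jsonify(draft)

if __name__ == "__main__":
    app.run(port=5000)
```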
Okay, let's rock.
Now it's time to break down this beautiful generative maze of steps. Let's say you used a combination of a Microsoft Access database application and spreadsheets to capture time sheet information for tracking and reporting purposes. You would want to narrow the next prompt down to the technical dependencies your business has, while incorporating elements of "historic employee data"; essentially, base it on Response 1's Data Collection step. Let's try breaking it into two different prompts for the same outcome.
Prompt 02: What steps can you take to centralize data that's in an Access Database and on Spreadsheets, so that you can begin to train a ML model with this data?
Extract data from Access Database: Export the data from the Access database into a format that is easy to work with, such as a CSV file.
Export data from Spreadsheets: Export data from the spreadsheets into a CSV file as well.
Combine data: Combine the data from Access and the spreadsheets into one file. You can use tools like the pandas library in Python to merge the CSV files.
Data Cleaning: Clean the combined data by removing duplicates, handling missing values, and resolving inconsistent data entries.
Feature Engineering: Feature engineering is the process of creating new features from existing ones to improve the performance of the machine learning model. Perform feature engineering tasks on the data to make it more useful and informative to the model.
Data preparation: Prepare the data for training the machine learning model. Split the data into training, validation, and testing sets.
Train the model: Choose a machine learning algorithm that best suits the problem, and train the model with the prepared data.
Evaluate the model: Evaluate the performance of the trained model using performance metrics such as accuracy, precision, recall, and F1 score.
Deploy the model: Finally, deploy the trained model in a production environment and use it to make predictions on new data.
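Here's a minimal sketch of the centralization steps above, assuming the Access database and the spreadsheets have already been exported to CSV files; the file names, column names, and split ratios are illustrative assumptions, not prescriptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Assumed CSV exports from the Access database and the spreadsheets
access_df = pd.read_csv("access_export.csv")
sheets_df = pd.read_csv("spreadsheet_export.csv")

# Combine the two sources into one table (assuming matching columns)
combined = pd.concat([access_df, sheets_df], ignore_index=True)

# Data cleaning: drop duplicates and rows missing the hours value
combined = combined.drop_duplicates()
combined = combined.dropna(subset=["hours_worked"])

# Simple feature engineering example: day of week derived from the date
combined["date"] = pd.to_datetime(combined["date"], errors="coerce")
combined["day_of_week"] = combined["date"].dt.dayofweek

# Data preparation: split into training, validation, and test sets (70/15/15)
train, temp = train_test_split(combined, test_size=0.30, random_state=42)
validation, test = train_test_split(temp, test_size=0.50, random_state=42)

print(len(train), len(validation), len(test))
```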
Ohhh, we discovered a new fun flow!
Extract data from Access Database: Export the data from the Access database into a format that is easy to work with, such as a CSV file.
Export data from Spreadsheets: Export data from the spreadsheets into a CSV file as well.
Combine data: Combine the data from Access and the spreadsheets into one file. You can use tools like the pandas library in Python to merge the CSV files.
The first step to anything data-related is "normalization". Your model's hungry, and it wants its "food" structured in a digestible way. Combining the data into a common structure that the model can navigate and learn from is key.
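In practice the Access export and the spreadsheet export rarely share column names or types, so a small example of aligning them to one common structure before combining might look like this; all of the source column names here are made up for illustration.

```python
import pandas as pd

access_df = pd.read_csv("access_export.csv")
sheets_df = pd.read_csv("spreadsheet_export.csv")

# Map each source's own column names onto one shared schema (names are hypothetical)
access_df = access_df.rename(
    columns={"EmpName": "employee_name", "WorkDate": "date", "Hrs": "hours_worked"}
)
sheets_df = sheets_df.rename(
    columns={"Employee": "employee_name", "Date": "date", "Hours": "hours_worked"}
)

shared_columns = ["employee_name", "date", "hours_worked"]
normalized = pd.concat(
    [access_df[shared_columns], sheets_df[shared_columns]], ignore_index=True
)

# Standardize types so the model gets one consistent, digestible structure
normalized["date"] = pd.to_datetime(normalized["date"], errors="coerce")
normalized["hours_worked"] = pd.to_numeric(normalized["hours_worked"], errors="coerce")
```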
Going back to Prompt 01, let's focus on the type of model we're going to be training, because that will inform how we structure our newly combined data.
This could lead to our next prompt! (Yay)
Prompt 03: What are the common types of machine learning model architectures, and what are their strengths and weaknesses?
There are several types of machine learning model architectures, each with its own strengths and weaknesses. Here are some of the most common ones:
Linear models: These models make predictions by computing a linear combination of the input features. Linear models are simple and efficient, making them useful for large datasets with many features. They are often used in regression tasks where the goal is to predict a continuous output variable.
Decision trees: These models make predictions by recursively partitioning the input space into smaller regions based on the values of the input features. Decision trees are intuitive and easy to interpret, making them useful for tasks that require transparency and explainability.
Neural networks: These models are inspired by the structure and function of the human brain. They consist of multiple layers of interconnected nodes, each performing a simple computation on the input data. Neural networks are highly expressive and can learn complex patterns in the data, making them useful for tasks like image and speech recognition.
Support vector machines (SVMs): These models find a hyperplane that separates the input data into different classes. SVMs are effective in high-dimensional spaces and can handle both linear and nonlinear classification tasks.
Ensemble models: These models combine multiple base models to improve prediction accuracy. Popular ensemble methods include bagging, boosting, and stacking.
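To ground this architecture survey in our time sheet example, here's a hedged sketch that trains a linear model and a decision tree to predict daily hours from a couple of engineered features, then compares their error. The data file and features are assumptions carried over from the earlier sketches, and the metric (mean absolute error) is just one reasonable choice.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

# Assumed: the combined, cleaned time sheet data from the earlier sketches
data = pd.read_csv("combined_timesheets.csv", parse_dates=["date"])
data["day_of_week"] = data["date"].dt.dayofweek
data["month"] = data["date"].dt.month

X = data[["day_of_week", "month"]]
y = data["hours_worked"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Try two of the architectures above and compare their average error in hours
for name, model in [("linear model", LinearRegression()),
                    ("decision tree", DecisionTreeRegressor(max_depth=5))]:
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    print(name, "MAE:", round(mean_absolute_error(y_test, predictions), 2))
```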
Well, what did we discover?
Each model offers a different value proposition. Which one do you start with? Do you map out a model architecture that chains various model types into a loop for deeper automation? Start small, understand your desired outcomes for time sheets, and filter your options down to a model that will immediately begin to support that problem.
Let's list some common time sheet pain points:
Employees forget to report their hours
Employees forget to submit their hours
The SaaS platform used for tracking hours is unreliable
Inputting hours is a "box check" that is unreliable
Let's think of some potential solution paths that could start to address these things:
A notification system that reaches a device or multiple devices using the most dependable channel for each employee. Maybe employees choose their own channel, which makes accountability much easier if they still continue to fail to report hours.
Same as number '1', but notifications occur much more frequently; employees are probably notified daily rather than weekly.
I would recommend diagnosing the issues with the SaaS platform and running a widespread survey; if a crowd of folks surfaces the "cost" of dealing with a faulty SaaS platform, you may be able to generate enough influence to change that issue without building a custom technology solution to resolve it.
There are really two paths here -- one is dystopian employee tracking, the other is creating a gamified system that is also tied to salary. This would shake up current paradigms; it would almost be like a "gig hybrid" situation with W2 employees, possibly incentivizing something like "remote behavior" and work ethic outside of a formal office setting.
Let's ignore numbers '1' and '2' because notifications are boring (just kidding), but for the sake of a fun journey, let's look at how we could resolve number '4': how we could impact employees' behavior and incentivize them to capture and input accurate labor hour information.
On the next AI Prompt Learning -- AI Prompt Learning 02 -- we'll kick things off with a recap of this article, as well as "incentivization trees" to drive the features that could be developed for a future product that solves the problem we selected. Of course, we'll lean heavily on ChatGPT research and prompt learning. We'll also dig into how we clean and enrich this data to prepare it for training. We'll certainly focus on the training itself as well, followed by the expected performance and interactions of each model. Finally, a special treat in the next AI Prompt Learning article will be how we use design principles to increase the quality of the data and the model being trained -- this section is going to be fun to discuss! We hope you enjoyed this article; please share it and spread the AI Prompt Learning love! Continue to follow our AI Prompt Learning series here.