
Using Machine Learning to Create an Auto Repair Estimate

New machine learning techniques let us use an LLM to automatically estimate car repair costs.
Machine Learning

Gabriel Valerio & Adam Smith

February 21, 2024

9 min read

The Tool

Let’s say you want to build an app that lets auto mechanics inspect a vehicle, verbally dictate their findings, and automatically generate a detailed repair estimate for the customer. The estimate includes pricing for parts and services, which the mechanic can review and adjust as necessary before sharing it with the customer. This app, and many similar use cases, can be built with a practical combination of speech-to-text and a generative AI model. Let’s break down how we would go about building it.

The Technology

The framework of our app is built with standard React Native components, enabling it to operate across iOS and Android devices. It includes a dictation function that records the mechanic’s speech and saves it to an audio file (this is fairly trivial so we’ll skip the explanation of this part).

At the core of our speech transcription process is OpenAI's Whisper, a state-of-the-art model with exceptional speech-to-text capabilities. Whisper accurately interprets the mechanic's spoken words, even amid the background noise typical of a busy auto body shop. That accuracy is pivotal for capturing the intricate details of vehicle inspections and ensuring that every observation is documented correctly.

To elevate these transcriptions into actionable auto repair estimates, we harness OpenAI's GPT-4, augmented with function calling. Function calling lets you describe functions to the model so that, instead of free-form text, it responds with structured arguments for those functions. Our system is backed by a database of the shop's services, parts, and their corresponding prices.
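The snippets later in this post obtain structured output through the prompt alone, but function calling proper means declaring a JSON schema that the model fills in. A hypothetical schema for our use case might look like the following (the name `record_repairs` and the field layout are illustrative, not from a real shop database):

```python
# A hypothetical function schema GPT-4 could be asked to "call" with the
# repairs it identifies. The name and fields are illustrative.
record_repairs_schema = {
    "name": "record_repairs",
    "description": "Record the repair services identified in a mechanic's dictation.",
    "parameters": {
        "type": "object",
        "properties": {
            "repairs": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "id": {"type": "string", "description": "Service id from the shop database"},
                        "quantity": {"type": "integer"},
                        "reason": {"type": "string"},
                    },
                    "required": ["id", "quantity", "reason"],
                },
            }
        },
        "required": ["repairs"],
    },
}

# The schema would be passed alongside the messages, e.g.:
# openai.ChatCompletion.create(model="gpt-4", messages=..., functions=[record_repairs_schema])
```

The model then returns a `function_call` whose arguments conform to this schema, which is easier to parse reliably than free text.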

Getting a transcription from an audio file with OpenAI's Whisper is as easy as this:

import whisper

# Load the base Whisper model; larger models trade speed for accuracy.
model = whisper.load_model("base")

def transcribe(audio_file_path):
    result = model.transcribe(audio_file_path)
    return result["text"]

When GPT-4 receives the transcription, it doesn't just read the text—it understands it. It delves into the details, distinguishing between different types of repairs and maintenance tasks. By analyzing the context and specifics of the mechanic's notes, GPT-4 can intelligently generate a JSON payload that we then use to create a tailored estimate. The estimate specifies the necessary parts and services, complete with associated costs, directly derived from the mechanic's expert assessment.

In the following code snippet, we show how to call OpenAI’s GPT-4 with a temperature of 0.1. A low temperature makes the model less creative and more deterministic, so it is more likely to follow the instructions in the initial prompt:

import openai
import os
from dotenv import load_dotenv

load_dotenv()

openai.api_key = os.getenv("OPENAI_API_KEY")

def build_messages(prompt, first_message):
    return [
        {
            "role": "system",
            "content": prompt,
        },
        {
            "role": "user",
            "content": first_message,
        },
    ]

def call_llm(prompt, first_message):
    # Uses the pre-1.0 openai SDK interface (openai.ChatCompletion).
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=build_messages(prompt, first_message),
        temperature=0.1,
    )
    return response["choices"][0]["message"]

The initial prompt is the set of instructions that guides the chat completion toward our desired result:

initial_prompt = """
Based on the following list of auto repair services and/or products, identify in the transcription all the repairs needed and the quantities needed of each. Answer only with the expected json array, no text or root object! Double check that you are including all of the repairs! Think it through, calmly; if you do it successfully, I will tip you $100!

Response Format (json):
[
  {{"id": "service_id", "quantity": 1, "reason": "reason why this service is needed"}}
]

Example:
Transcription:
  Starting with the exterior, the car's bodywork is in prime condition. Moving under the hood, the engine is running smoothly with no signs of trouble. However, the brake system requires some attention, likely needing new pads and possibly rotors. The transmission is operating flawlessly, showing no signs of wear or need for repair. Tires are in good shape, with plenty of tread life left, but alignment is slightly off, necessitating an adjustment. The interior is well-kept, with upholstery and dashboard looking like new. The air conditioning system, though, is not cooling efficiently and might need a recharge or a more in-depth look at the compressor. All lights, both interior and exterior, are functioning properly. Lastly, the suspension system is robust, providing a smooth ride without any noticeable issues.
Result:
  [
    {{"id": "brake_repair", "quantity": 1, "reason": "Brake pads and possibly rotors need replacing."}},
    {{"id": "wheel_alignment", "quantity": 1, "reason": "Alignment is slightly off, requiring adjustment."}},
    {{"id": "ac_service", "quantity": 1, "reason": "Air conditioning not cooling efficiently, may need recharge or compressor inspection."}}
  ]

Services Available:
{list_of_services}
"""
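The template is filled in with the shop's service list via `str.format`, which is why literal JSON braces in the prompt need to be doubled (`{{` and `}}`): `format` only substitutes the single-braced `{list_of_services}` placeholder. A minimal sketch with a made-up services table:

```python
# Hypothetical slice of the prompt template: doubled braces escape literal
# JSON braces so that str.format only substitutes {list_of_services}.
template = """Response Format (json):
[
  {{"id": "service_id", "quantity": 1, "reason": "why"}}
]

Services Available:
{list_of_services}
"""

# A made-up services table standing in for the shop's real database.
services = [
    {"id": "brake_repair", "name": "Brake pad replacement", "price": 250},
    {"id": "wheel_alignment", "name": "Wheel alignment", "price": 90},
]

list_of_services = "\n".join(
    f'- {s["id"]}: {s["name"]} (${s["price"]})' for s in services
)

prompt = template.format(list_of_services=list_of_services)
```

If any literal brace in the template is left single, `format` will raise an error or mangle the example JSON, so it is worth eyeballing the final prompt once during development.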

In the first message sent after the initial prompt, we feed the model the user's transcription so that it returns a JSON array of all the services identified in the text, which can then be used to automatically calculate the Bill of Services.

Best Practices for Function Calling Prompts

Getting the best results from GPT-4 function calling is somewhat of an art. In our experience these have worked well:

  • Offer a tip: We don’t know why, but offering the LLM a $100 tip has yielded better results.
  • Provide examples: Include a few example transcription snippets in the prompt and show how each should be handled.
  • Be specific: Be as explicit as possible in your prompt about what you’d like the LLM to do.

The Benefits

Even in its prototype phase, the app has demonstrated the ability to produce accurate estimates more than 75% of the time, promising significant time and cost savings for the business. Additionally, this automation reduces the potential for human error, ensuring customers receive reliable and transparent pricing.

Opportunities for Enhancements

One promising direction for enhancement lies not in the fine-tuning of models but in the strategic evolution of our interaction with OpenAI's GPT-4 through chain of thought prompting.

Chain of thought prompting is a technique that guides the language model through a step-by-step reasoning process to arrive at a conclusion or output. Instead of merely asking for a direct answer, this method involves posing a series of logical steps that the model can follow to deduce the answer. This approach is particularly useful for complex queries or tasks that benefit from explicit reasoning, as it helps the model to "show its work," much like a math student solving a problem on a chalkboard.

Incorporating chain of thought into the prompt might look like:

"Based on the mechanic's dictation, identify and list the specific repair tasks required for the vehicle. Begin by summarizing the mechanic's findings to understand the overall condition of the car. Then, categorize these findings into individual tasks that need to be addressed. For instance, if the mechanic mentions that the brake pads are worn out, list 'replace brake pads' as a task. Similarly, if there's mention of an overdue oil change, add 'perform oil change' to the list. Finally, if a leak in the air conditioning system is noted, include 'repair air conditioning leak' as a necessary task."

For our app, incorporating chain of thought prompting can significantly enhance the accuracy and depth of the repair estimates generated by GPT-4. By structuring prompts to lead GPT-4 through the mechanic's observations, analyzing each aspect of the vehicle's condition, and then logically determining the necessary repairs and associated costs, we can achieve a more nuanced understanding of the repair needs. This method can also help in identifying ancillary services that might be overlooked by a more straightforward analysis.
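One practical consequence of chain of thought prompting is that the model's answer arrives wrapped in its reasoning, so the JSON has to be pulled out of the surrounding text. A minimal sketch of that extraction step, assuming the reasoning itself contains no square brackets:

```python
import json
import re

def extract_json_array(text):
    """Pull the first JSON array out of a response that may contain
    free-form reasoning before or after it."""
    match = re.search(r"\[.*\]", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON array found in model output")
    return json.loads(match.group(0))

# Example chain-of-thought style output: reasoning first, then the array.
cot_output = """First, the mechanic notes worn brake pads, so brake service is needed.
The alignment is slightly off, so a wheel alignment is also required.

[{"id": "brake_repair", "quantity": 1, "reason": "Worn pads"},
 {"id": "wheel_alignment", "quantity": 1, "reason": "Alignment off"}]"""

repairs = extract_json_array(cot_output)
```

Another common approach is to instruct the model to end its response with a clearly delimited answer section and split on that delimiter instead of using a regex.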

Applications in Other Industries

The potential applications of this suite of technologies extend far beyond auto repairs. Here are a few examples:

  • Facility Management in Large Corporate Campuses: Automating maintenance and repair task identification and cost estimation can streamline operations and budgeting for large campuses.
  • Quality Control in Manufacturing: Identifying defects and required corrective actions through analysis of inspection reports can improve product quality and operational efficiency.
  • Security Guard Building Checks: Transcribing and analyzing security patrol reports to identify and prioritize security concerns and maintenance issues.

Beyond simple function calling

For use cases where function calling alone might not suffice, integrating a vector database like Pinecone, Chroma, or ElasticSearch can add depth to the analysis. Pairing an LLM with retrieval from a vector database is known as Retrieval Augmented Generation (RAG). RAG apps offer a sophisticated blend of the generative capabilities of language models with precise, query-specific information retrieval from vector databases. This synergy enables applications to produce responses that are not only contextually relevant but also deeply informed by the specific data points stored in the vector database.

In practice, RAG can significantly enhance applications in various domains. For instance, in healthcare, a RAG-enhanced system could analyze patient records and medical literature simultaneously to suggest personalized treatment plans. In customer service, it could retrieve and incorporate specific user account information into its responses, providing personalized assistance at scale.

This approach marries the broad, intuitive understanding of language models with the targeted, data-specific insights of vector databases, resulting in outputs that are both accurate and richly informative. As such, RAG applications represent a powerful tool for tackling complex analytical tasks that require a deep dive into large datasets, offering nuanced insights that would be beyond the reach of either technology used in isolation.
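The retrieval half of RAG ultimately boils down to nearest-neighbor search over embeddings. A toy sketch of that step, using tiny hand-written 3-dimensional vectors in place of real embeddings (a real system would get vectors from an embedding model and store them in one of the databases above):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy "embeddings" for a few services; real vectors have hundreds of dimensions.
service_vectors = {
    "brake_repair": [0.9, 0.1, 0.0],
    "ac_service": [0.0, 0.2, 0.9],
    "wheel_alignment": [0.5, 0.8, 0.1],
}

def top_k_services(query_vector, k=2):
    """Retrieve the k services whose embeddings are most similar to the query."""
    ranked = sorted(
        service_vectors.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [service_id for service_id, _ in ranked[:k]]

# A query embedding close to the brake-repair vector retrieves it first.
top = top_k_services([1.0, 0.0, 0.0], k=2)
```

The retrieved services would then be injected into the prompt (for example, as the `list_of_services` above), so the model only reasons over the handful of entries relevant to the transcription rather than the whole catalog.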

Conclusion

By harnessing the power of advanced LLMs like GPT-4 and cutting-edge transcription technology like Whisper, we're not just innovating in the auto repair industry; we're opening doors to a myriad of applications across different sectors. This blend of AI and domain knowledge stands to redefine efficiency and accuracy in operations and customer service, setting a new benchmark for technological integration in business processes.
