Operational Efficiency

Improving the efficiency of market intelligence with LLMs

A media company wanted to extract key industry trends from quarterly company reports to improve their market intelligence capabilities. In this project, we demonstrated how large language models (LLMs) could extract valuable insights from vast amounts of unstructured data. For some tasks, our pipeline could even outperform the manual work of a financial analyst.

To protect confidentiality, we may alter specific details while preserving the accuracy of our core contribution.

Context & objectives

The objective of the project was to obtain clear and complete insights into sectorial trends and companies' strategies, transforming raw data into actionable market intelligence.

Our LLM pipeline needed to prove the feasibility of extracting the required information faster than a financial analyst while maintaining the quality of the data. This meant we needed to compare our results to those of an analyst manually processing the same data.

We identified four primary challenges that we needed to overcome to meet this goal:

  1. Scattered data

Around 500 company reports from quarterly shareholder meetings provided large volumes of rich, unstructured text data for analysis by large language models. However, our objective was to extract specific trends related to a single department.

This meant that there could be instances where only a few sentences would reference the desired topic in an entire report. For this reason, we needed to be cautious about the implications of sparse data on the results, especially its impact on accuracy.

  2. Exhaustivity

An analyst reading the content would not miss any essential market intelligence information. As a result, we had to ensure, with a high degree of certainty, that our model did not miss any key information either.

  3. Reliability

Since an analyst reading the report wouldn't invent or change any information, we needed to make sure our model produced no hallucinations. Where results were not perfect, we had to attach reasonable levels of confidence to them.

  4. Structure

Our solution had to enable pattern and trend detection in a structured format. We achieved this by successfully converting the text into structured data in Excel, enabling further quantitative and qualitative market intelligence analyses (like dashboards).
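The exhaustivity and reliability challenges above can be made measurable by comparing the pipeline's output with an analyst's annotations for the same report. The sketch below is only an illustration under assumptions: the function names, the exact-match comparison after light normalisation, and the sample items are all hypothetical and not taken from the project. Recall stands in for exhaustivity (how much of the analyst's list was found) and precision for reliability (how much of the model's list the analyst also reported).

```python
def normalize(item: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(item.lower().split())


def compare_to_analyst(model_items, analyst_items):
    """Compare model output with an analyst's list for the same report."""
    model = {normalize(i) for i in model_items}
    analyst = {normalize(i) for i in analyst_items}
    found = model & analyst
    # Recall: share of the analyst's items the model also extracted.
    recall = len(found) / len(analyst) if analyst else 1.0
    # Precision: share of the model's items that the analyst also reported.
    precision = len(found) / len(model) if model else 1.0
    return {
        "recall": recall,
        "precision": precision,
        "missed": sorted(analyst - model),       # candidates for exhaustivity gaps
        "unsupported": sorted(model - analyst),  # candidates for hallucinations
    }


# Invented sample data, for illustration only.
analyst = ["Hiring freeze in Q2", "New ERP rollout", "Budget cut of 5%"]
model = ["new ERP rollout", "Hiring freeze in  Q2", "Office relocation"]
report = compare_to_analyst(model, analyst)
```

Exact matching is a deliberate simplification here. In practice, paraphrased insights would need a fuzzier comparison or a manual check of the `missed` and `unsupported` lists.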

Approach

  1. Information filtering and structuring

We started by implementing information filtering and summarization from the raw text reports. The output of this first step was a list of key sentences for every report. Those key sentences contained and summarized all the insights that had to be structured.

The key sentences then had to be structured to fit an Excel format. The main challenge was the diversity of the information's formatting.
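To illustrate what the structuring step might produce, here is a minimal sketch that turns key sentences into tabular rows. It assumes a hypothetical `TrendRecord` schema (the actual columns are not described here) and writes CSV text, which Excel can open, using only the Python standard library.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class TrendRecord:
    # Hypothetical columns; the real schema is not given in the case study.
    company: str
    quarter: str
    topic: str
    key_sentence: str


def to_csv(records):
    """Serialise records to CSV text: one header row, then one row per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(TrendRecord)],
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Invented sample rows, for illustration only.
rows = [
    TrendRecord("Company A", "2023-Q1", "hiring", "Headcount will stay flat this year."),
    TrendRecord("Company B", "2023-Q1", "tooling", "We plan to replace our legacy system, starting in Q3."),
]
csv_text = to_csv(rows)
```

Fixing the columns up front forces every key sentence into the same shape, whatever its original formatting, which is what makes later filtering and dashboarding straightforward.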

  2. Choosing the right model

Throughout the project, it became evident that the quality of the prompt significantly impacted the results. Even with the latest and most expensive GPT models, starting with a well-crafted, carefully refined base prompt yielded better outcomes. Therefore, we found it essential to prioritize prompt quality over upgrading the model to achieve the best results.

Moreover, the trade-off between investment (time and money) and results was an important consideration for delivering cost-effective market intelligence. Going from GPT-3.5 to GPT-4 resulted in a 30X increase in costs. This amount is staggering, and developing a system that could balance the trade-off was crucial.
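The effect of the 30x figure on overall spend can be worked through with simple arithmetic. The sketch below assumes, purely for illustration, that a call to the larger model costs 30 times a call to the smaller one and that only a fraction of calls is sent to the larger model. The function name and the routing idea are hypothetical and not a description of the actual system.

```python
def relative_cost(share_on_large_model: float, cost_ratio: float = 30.0) -> float:
    """Cost relative to running every call on the cheaper model.

    share_on_large_model: fraction of calls sent to the expensive model (0..1).
    cost_ratio: how many times more an expensive-model call costs (assumed 30).
    """
    if not 0.0 <= share_on_large_model <= 1.0:
        raise ValueError("share_on_large_model must be between 0 and 1")
    cheap_part = 1.0 - share_on_large_model
    expensive_part = share_on_large_model * cost_ratio
    return cheap_part + expensive_part


for share in (0.0, 0.1, 0.5, 1.0):
    print(f"{share:.0%} on the larger model -> {relative_cost(share):.1f}x baseline cost")
```

Under these assumptions, sending even 10% of the calls to the larger model roughly quadruples the bill (0.9 + 0.1 × 30 = 3.9), which is why the split between models matters so much.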

  3. Pooling system with multiple tailored models

To further enhance the accuracy of the content extraction, we used a technique called pooling. Instead of relying on a single model, we aggregated the results from multiple models. This technique improved the accuracy of the content extraction by 50%.
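One simple way to picture pooling is as a merge of the items extracted by several models into a single deduplicated list, keeping track of which model produced what. The exact aggregation rule is not specified here, so the sketch below is an assumption: `pool_extractions`, the model labels, and the sample items are hypothetical.

```python
from collections import defaultdict


def pool_extractions(outputs_by_model):
    """Merge per-model lists of extracted items into one deduplicated pool.

    Returns a dict mapping each normalised item to the sorted list of models
    that produced it, so later steps can see how widely an item is supported.
    """
    pooled = defaultdict(set)
    for model_name, items in outputs_by_model.items():
        for item in items:
            # Normalise case and whitespace so near-identical strings merge.
            key = " ".join(item.lower().split())
            if key:
                pooled[key].add(model_name)
    return {key: sorted(models) for key, models in pooled.items()}


# Invented sample outputs from three hypothetical tailored models.
outputs = {
    "prompt_a": ["Hiring freeze", "ERP rollout"],
    "prompt_b": ["hiring freeze", "Budget cut"],
    "prompt_c": ["ERP  rollout", "Hiring freeze"],
}
pool = pool_extractions(outputs)
```

Taking the union favours exhaustivity, since an item missed by one model can still be caught by another. The per-item list of supporting models gives a first signal of how much to trust each entry.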

LLM-based voting system

To address the reliability challenge, we introduced a voting system. This system involved running repeated queries with different models (GPT-3.5 and GPT-4) and assigning voting powers to each model. We selected the output with the highest number of votes as the result. If the number of votes didn't meet a certain threshold, we classified the extracted information as unreliable and flagged it for manual review.
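The voting logic described above can be sketched in a few lines. This is a minimal illustration under assumptions rather than the production implementation: the voting powers, the threshold value, and the sample answers are hypothetical.

```python
from collections import defaultdict


def weighted_vote(answers, weights, min_votes):
    """Pick the answer with the most weighted votes.

    answers: list of (model_name, answer) pairs, one per query run.
    weights: voting power assigned to each model name.
    min_votes: votes the winner needs to be considered reliable.
    Returns (answer, votes, reliable); ties go to the answer seen first.
    """
    if not answers:
        return None, 0.0, False
    tally = defaultdict(float)
    for model_name, answer in answers:
        tally[answer] += weights[model_name]
    winner = max(tally, key=tally.get)
    votes = tally[winner]
    return winner, votes, votes >= min_votes


# Hypothetical voting powers and repeated query results.
weights = {"gpt-3.5": 1.0, "gpt-4": 2.0}
runs = [
    ("gpt-3.5", "increase"),
    ("gpt-3.5", "decrease"),
    ("gpt-3.5", "increase"),
    ("gpt-4", "increase"),
]
result = weighted_vote(runs, weights, min_votes=3.5)
```

Answers that fall below `min_votes` would be routed to manual review rather than written to the structured output, which keeps uncertain extractions out of downstream analyses.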

The process of choosing the right LLM

We estimated that our prompt development process resulted in lower overall costs than similar manual work by a full-time analyst. Our initial estimates showed that the operating cost of the LLM pipeline was at least ten times cheaper than a full-time analyst.

Results

We developed a solution for our client using LLMs that could extract data faster and more cheaply than an analyst while being as accurate and reliable, if not more so. In this project, we had to overcome the challenges of exhaustivity, reliability, and structure in our approach. Doing so further proved that our client could successfully implement LLMs to eliminate time-intensive manual work and improve operational efficiency.



Ready to reach your goals with data?

If you want to reach your goals through the smarter use of data and A.I., you're in the right place.


© 2025 Agilytic
