Minimally-lossy text simplification with Gemini


Gemini-powered automatic evaluation and prompt refinement system

Crafting prompts for nuanced simplification, where readability must improve without sacrificing meaning or detail, is challenging. To meet this challenge, we developed an automated approach that leverages Gemini models both to evaluate simplification quality and to refine the simplification prompt itself, enabling the extensive trial and error needed to discover the most effective prompt.

Automated evaluation

Manual evaluation is impractical for rapid iteration. Our system employs two novel evaluation components:

  1. Readability assessment: Moving beyond simplistic surface metrics like Flesch-Kincaid, we used a Gemini prompt to score text readability on a 1-10 scale. This prompt was iteratively refined against human judgment, enabling a more nuanced assessment of comprehension ease. In our testing, this LLM-based assessment aligned with human readability judgments better than Flesch-Kincaid did (a minimal sketch follows this list).
  2. Fidelity assessment: Ensuring meaning preservation is critical. Using Gemini 1.5 Pro, we implemented a process that maps claims from the original text to the simplified version. This identifies specific error types, such as information loss, gain, or distortion, each weighted by severity, yielding a granular measure of faithfulness to the original meaning in terms of completeness and entailment (a companion sketch follows this list).
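
To make the first evaluator concrete, here is a minimal Python sketch of an LLM-based readability judge. The call_gemini helper, the rubric prompt, and the reply parsing are all illustrative assumptions; the production rubric was itself refined against human judgments and is not reproduced in the post.

```python
import re

def call_gemini(prompt: str) -> str:
    """Hypothetical helper: send a prompt to a Gemini model and return the
    text reply. Substitute your own client (e.g., a Gemini SDK call)."""
    raise NotImplementedError("wire up a real Gemini client here")

# Illustrative rubric; the production prompt was iteratively refined
# against human judgments.
READABILITY_PROMPT = """\
Rate the readability of the following text on a scale from 1 (very hard
to comprehend) to 10 (effortless to read). Consider vocabulary, sentence
structure, and overall clarity. Reply with the number only.

Text:
{text}
"""

def readability_score(text: str) -> int:
    """Score comprehension ease with an LLM judge rather than Flesch-Kincaid."""
    reply = call_gemini(READABILITY_PROMPT.format(text=text))
    match = re.search(r"\d+", reply)  # tolerate extra words around the number
    if not match:
        raise ValueError(f"unparseable judge reply: {reply!r}")
    return max(1, min(10, int(match.group())))
```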

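A companion sketch of the claim-mapping fidelity check, reusing the hypothetical call_gemini helper above. The JSON schema, severity weights, and scoring formula are assumptions for illustration; the post specifies only the error types (loss, gain, distortion) and that each is weighted by severity.

```python
import json

# Assumed severity weights per error type; the post does not publish
# its exact weighting scheme.
SEVERITY = {"preserved": 0.0, "loss": 1.0, "gain": 0.5, "distortion": 1.5}

CLAIM_MAP_PROMPT = """\
List the factual claims in the ORIGINAL text and check each against the
SIMPLIFIED text, labeling it "preserved", "loss" (missing), or
"distortion" (meaning changed). Also list any "gain" claims that appear
only in the SIMPLIFIED text. Reply with a JSON array of objects with
fields "claim" and "label", and nothing else.

ORIGINAL:
{original}

SIMPLIFIED:
{simplified}
"""

def fidelity_score(original: str, simplified: str) -> float:
    """Return a 0-1 faithfulness score; 1.0 means every claim is preserved."""
    reply = call_gemini(
        CLAIM_MAP_PROMPT.format(original=original, simplified=simplified))
    claims = json.loads(reply)  # assumes the judge returns bare JSON
    penalty = sum(SEVERITY.get(c["label"], 0.0) for c in claims)
    n_original = max(1, sum(1 for c in claims if c["label"] != "gain"))
    return max(0.0, 1.0 - penalty / n_original)
```
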
Iterative prompt refinement: LLMs optimizing LLMs

The quality of the final simplification (generated by Gemini 1.5 Flash) depends heavily on the prompt. We therefore automated prompt optimization itself via a refinement loop: using the automatic readability and fidelity scores, a Gemini 1.5 Pro model analyzed the simplification prompt’s performance and proposed refined prompts for the next iteration.

This creates a powerful feedback loop: one LLM evaluates the output of another and refines its instructions based on performance metrics (readability and fidelity) and the granular error reports. The system thereby replaces laborious manual prompt engineering with the autonomous discovery of highly effective simplification strategies over hundreds of iterations; for this work, the loop ran for 824 iterations until performance plateaued.
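
The loop itself fits in a few lines. The sketch below is an assumed skeleton built on the hypothetical call_gemini, readability_score, and fidelity_score helpers from the earlier sketches; the score combination, plateau criterion, and refiner prompt are illustrative, not the published method.

```python
import statistics

REFINER_PROMPT = """\
You are optimizing a text-simplification prompt. Current prompt:
{prompt}

Mean readability: {readability:.2f}/10
Mean fidelity: {fidelity:.2f}/1.0

Propose a revised prompt that raises readability without sacrificing
fidelity. Reply with the new prompt only.
"""

def refine_prompt(prompt: str, corpus: list[str], patience: int = 20) -> str:
    """Iteratively rewrite the simplification prompt until scores plateau."""
    best_prompt, best_score, stale = prompt, float("-inf"), 0
    while stale < patience:
        # 1. Simplify each source text with the current candidate prompt
        #    (Gemini 1.5 Flash in the post; call_gemini is our stand-in).
        simplified = [call_gemini(f"{prompt}\n\n{text}") for text in corpus]
        # 2. Auto-evaluate readability and fidelity (sketched earlier).
        readability = statistics.mean(readability_score(s) for s in simplified)
        fidelity = statistics.mean(
            fidelity_score(t, s) for t, s in zip(corpus, simplified))
        score = readability / 10 + fidelity  # naive combination, for illustration
        if score > best_score:
            best_prompt, best_score, stale = prompt, score, 0
        else:
            stale += 1
        # 3. Ask a stronger model (Gemini 1.5 Pro in the post) to analyze the
        #    prompt's performance and propose the next candidate.
        prompt = call_gemini(REFINER_PROMPT.format(
            prompt=prompt, readability=readability, fidelity=fidelity))
    return best_prompt
```

In practice the refiner would also receive the granular error reports from the fidelity judge, which the post highlights as an input to the analysis; they are omitted here to keep the skeleton short.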
