Voice logging quality improvements (parsing & matching)
What we're solving:
When you use voice or text logging to quickly log your meals, the app should understand common foods correctly. Two critical issues currently prevent this:
- Food Matching Bug: The search algorithm returns incorrect matches. For example, typing "butter" matches "butterfish" instead of actual butter, and "blueberries" finds nothing even though blueberries exist in the database.
- Natural Language Parsing: The parser doesn't understand conversational input patterns like "half a cup", "medium apple", or "one and a half cups".
This task fixes both issues to ensure voice logging feels natural and reliable.
Part 1: Fix Food Matching Algorithm (from PXL-826)
Bug Summary
The voice and text logging feature has critical food matching issues where common, everyday foods are not being matched correctly. The semantic search/embedding system appears to be returning semantically similar but incorrect matches, while overlooking exact or near-exact text matches.
Reproduction Steps
Issue 1: Incorrect Match for "butter"
- Open voice or text logging
- Say or type "1 tablespoon of butter"
- Expected: Matches "Butter", "Butter, salted", or a similar butter entry
- Actual: Matches "Butterfish, raw", a completely different food item
Issue 2: No Match for "blueberries"
- Open voice or text logging
- Say or type "1 cup of blueberries"
- Expected: Matches "Blueberries, Raw" which exists in the database
- Actual: No match found for "Blueberry"
Technical Analysis
This appears to be a search/matching algorithm issue with the following potential root causes:
Likely Causes to Investigate:
- Embedding Quality Issues
  - The embeddings may not be properly weighting exact or near-exact text matches
  - "Butterfish" may be semantically closer to "butter" in the embedding space than actual "butter"
  - Singular/plural variations ("Blueberry" vs "Blueberries") may not be handled
- Search Algorithm Issues
  - The matching algorithm may rely too heavily on semantic similarity without a text-matching boost
  - No fuzzy matching or stemming for singular/plural forms
  - Missing a "prefer exact match" or "prefer substring match" heuristic
- Data Issues
  - Verify that "Butter" and "Blueberries, Raw" exist in the database with proper embeddings
  - Check if embeddings were generated correctly for these common foods
Suggested Investigation Areas
- Review the food matching/search implementation in the voice logging flow
- Check how embeddings are generated and stored
- Investigate adding a hybrid search approach:
  - Text/fuzzy matching for exact/near-exact matches
  - Semantic search as a fallback
  - Boost scores for exact substring matches
- Consider implementing stemming or lemmatization for word variations
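The hybrid approach above could look roughly like the following sketch. It is not the existing implementation: `FoodCandidate`, `singularize`, `textBoost`, and `rank` are hypothetical names, the semantic score is assumed to be a value between 0 and 1 supplied by the current embedding search, and the boost values are arbitrary placeholders that would need tuning.

```swift
import Foundation

// Hypothetical candidate: a database food name plus the semantic (embedding)
// similarity the existing search is assumed to return, in 0...1.
struct FoodCandidate {
    let name: String
    let semanticScore: Double
}

// Naive plural stripping ("blueberries" -> "blueberry", "eggs" -> "egg").
// A real implementation would likely use a proper stemmer or lemmatizer.
func singularize(_ word: String) -> String {
    if word.hasSuffix("ies"), word.count > 4 {
        return String(word.dropLast(3)) + "y"
    }
    if word.hasSuffix("s"), !word.hasSuffix("ss"), word.count > 3 {
        return String(word.dropLast())
    }
    return word
}

// Lowercases, splits on non-alphanumeric characters, and singularizes each word.
func normalizedWords(_ text: String) -> [String] {
    return text.lowercased()
        .components(separatedBy: CharacterSet.alphanumerics.inverted)
        .filter { !$0.isEmpty }
        .map(singularize)
}

// Text-match boost based on whole words, so "butter" does not match the
// longer word "butterfish". Boost values are placeholders.
func textBoost(query: String, candidate: String) -> Double {
    let q = normalizedWords(query)
    let c = normalizedWords(candidate)
    guard !q.isEmpty, !c.isEmpty else { return 0 }
    if q == c { return 2.0 }                       // exact name match
    if c.starts(with: q) { return 1.5 }            // "butter" vs "Butter, salted"
    if Set(q).isSubset(of: Set(c)) { return 1.0 }  // all query words present
    return 0
}

// Orders candidates by semantic score plus text boost, highest first.
func rank(query: String, candidates: [FoodCandidate]) -> [FoodCandidate] {
    return candidates.sorted {
        let a = $0.semanticScore + textBoost(query: query, candidate: $0.name)
        let b = $1.semanticScore + textBoost(query: query, candidate: $1.name)
        return a > b
    }
}
```

With this scoring, a "Butter, salted" entry with a modest semantic score would outrank "Butterfish, raw" with a higher one, and "blueberry" would receive a boost against "Blueberries, Raw". Comparing whole words rather than raw substrings is a deliberate choice here, since a plain substring check would also boost "butterfish" for the query "butter".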
Impact
High - This bug affects core functionality of voice logging, which is a primary feature for quick food entry. Users cannot reliably log common foods like "butter" and "blueberries", which significantly degrades the user experience.
Part 2: Enhance Natural Language Parsing (from PXL-825)
Overview
Enhance the voice logging feature's natural language parsing capabilities to support more conversational and varied input patterns. Currently, the parser handles basic quantity + unit + food patterns, but users naturally speak in more complex ways that need to be supported for a seamless logging experience.
Problem Statement
Users attempting to log food via voice encounter parsing failures or incorrect interpretations when using common natural language patterns such as:
- Fractional words ("half", "quarter")
- Compound quantities ("one and a half")
- Decimal numbers as words ("five point three")
- Size descriptors without units ("medium apple")
- Percentages in food names ("two percent milk")
- Brand names as identifiers ("quest bar")
Technical Requirements
1. Fraction Word Recognition
- Parse "half" as 0.5 (e.g., "half a cup" → 0.5 cups)
- Parse "quarter" as 0.25 (e.g., "quarter cup" → 0.25 cups)
- Handle optional articles ("half a cup" vs "half cup")
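A minimal sketch of the fraction-word rule, assuming a simple whitespace split of the input. `fractionWords` and `parseLeadingFraction` are hypothetical names, and the food name "half and half" is not handled here.

```swift
import Foundation

// Hypothetical lookup table of fraction words and their numeric values.
let fractionWords: [String: Double] = ["half": 0.5, "quarter": 0.25]

// Parses inputs that begin with a fraction word, e.g. "half a cup".
// Returns the quantity and the remaining text, or nil if the input does
// not start with a known fraction word.
func parseLeadingFraction(_ input: String) -> (quantity: Double, rest: String)? {
    var words = input.lowercased().split(separator: " ").map(String.init)
    guard let first = words.first, let value = fractionWords[first] else {
        return nil
    }
    words.removeFirst()
    // Skip an optional article so "half a cup" and "half cup" are equivalent.
    if let next = words.first, next == "a" || next == "an" {
        words.removeFirst()
    }
    return (value, words.joined(separator: " "))
}
```

For "Half a cup of blueberries" this yields a quantity of 0.5 and the remainder "cup of blueberries", which the unit and food parsing can then consume.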
2. Compound Quantity Parsing
- Support "X and a half" patterns (e.g., "one and a half cups" → 1.5 cups)
- Support "X and a quarter" patterns
- Handle standalone fractions ("one and a half" without units for countable items)
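The compound pattern could be handled along these lines. This is a sketch under the assumption that the quantity appears at the start of the input; `numberWords`, `fractionValues`, and `parseCompoundQuantity` are hypothetical names.

```swift
import Foundation

// Hypothetical word tables; extend as needed.
let numberWords: [String: Double] = [
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6,
    "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11, "twelve": 12
]
let fractionValues: [String: Double] = ["half": 0.5, "quarter": 0.25]

// Parses "<number> and a <fraction> ..." and returns the combined quantity
// plus the remaining text. Returns nil when the pattern is absent.
func parseCompoundQuantity(_ input: String) -> (quantity: Double, rest: String)? {
    let words = input.lowercased().split(separator: " ").map(String.init)
    guard words.count >= 4,
          let whole = numberWords[words[0]] ?? Double(words[0]),
          words[1] == "and",
          words[2] == "a",
          let fraction = fractionValues[words[3]]
    else { return nil }
    return (whole + fraction, words.dropFirst(4).joined(separator: " "))
}
```

An empty remainder, as in a bare "one and a half", would correspond to the standalone case for countable items.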
3. Decimal Number Words
- Parse "point" as decimal separator (e.g., "five point three" → 5.3)
- Support single decimal place patterns common in nutrition (e.g., "five point three ounces")
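One possible way to convert spoken decimals, limited to a single digit on each side of "point" as a simplifying assumption. `digitWords` and `parseSpokenDecimal` are hypothetical names.

```swift
import Foundation

// Hypothetical table of single-digit number words.
let digitWords: [String: Int] = [
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9
]

// Converts "<digit word> point <digit word>" into a Double.
// Returns nil for any other shape of input.
func parseSpokenDecimal(_ phrase: String) -> Double? {
    let words = phrase.lowercased().split(separator: " ").map(String.init)
    guard words.count == 3,
          words[1] == "point",
          let whole = digitWords[words[0]],
          let tenths = digitWords[words[2]]
    else { return nil }
    // Build the value from its decimal string so the result equals the
    // literal (e.g. 5.3) without accumulating floating-point error.
    return Double("\(whole).\(tenths)")
}
```

Larger whole parts ("twelve point five") would need the fuller number-word table used for ordinary quantities.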
4. Unit-less Quantities with Size Descriptors
- Recognize size words: "small", "medium", "large"
- Apply to countable foods (e.g., "one medium apple", "one medium banana")
- Map size descriptors to appropriate serving size multipliers in food database lookups
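A rough sketch of how size descriptors might be modeled. `SizeDescriptor`, `SizedItem`, and `parseSizedItem` are hypothetical, and the multipliers are placeholders only; real values would presumably come from per-food serving data in the database.

```swift
import Foundation

// Hypothetical size descriptor. The multipliers are placeholders relative to
// a "medium" serving, not values taken from the food database.
enum SizeDescriptor: String {
    case small, medium, large

    var servingMultiplier: Double {
        switch self {
        case .small: return 0.75   // placeholder
        case .medium: return 1.0
        case .large: return 1.25   // placeholder
        }
    }
}

struct SizedItem {
    let quantity: Double
    let size: SizeDescriptor
    let food: String
}

// Parses "<count> <size> <food>", e.g. "one medium apple".
func parseSizedItem(_ input: String) -> SizedItem? {
    let counts: [String: Double] = ["one": 1, "two": 2, "three": 3]
    let words = input.lowercased().split(separator: " ").map(String.init)
    guard words.count >= 3,
          let quantity = counts[words[0]] ?? Double(words[0]),
          let size = SizeDescriptor(rawValue: words[1])
    else { return nil }
    let food = words.dropFirst(2).joined(separator: " ")
    return SizedItem(quantity: quantity, size: size, food: food)
}
```

Keeping the size as its own value, rather than folding it into the food name, lets the lookup choose a size-specific serving when the database has one and fall back to a multiplier otherwise.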
5. Percentage in Food Names
- Preserve percentage words as part of food name (e.g., "two percent milk" → food: "2% milk" or "two percent milk")
- Distinguish between quantity percentages and food name percentages contextually
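As an illustration, percentage words could be normalized once the quantity and unit have already been separated from the food name. That ordering is an assumption of this sketch, and `smallNumbers` and `normalizePercentInFoodName` are hypothetical names.

```swift
import Foundation

// Hypothetical table of number words likely to precede "percent".
let smallNumbers: [String: Int] = ["one": 1, "two": 2, "three": 3, "four": 4, "five": 5]

// Rewrites "<number> percent" to "<n>%" so it stays part of the food name.
// Assumes the text passed in is only the food-name portion, i.e. any
// leading quantity and unit ("one cup of") have already been removed.
func normalizePercentInFoodName(_ foodName: String) -> String {
    let words = foodName.lowercased().split(separator: " ").map(String.init)
    var result: [String] = []
    var index = 0
    while index < words.count {
        if index + 1 < words.count,
           words[index + 1] == "percent",
           let value = smallNumbers[words[index]] ?? Int(words[index]) {
            result.append("\(value)%")
            index += 2  // consume both the number and "percent"
        } else {
            result.append(words[index])
            index += 1
        }
    }
    return result.joined(separator: " ")
}
```

Because the function only sees the food-name portion, the "two" in "one cup of two percent milk" is never a candidate for the quantity, which is one simple way to make the contextual distinction.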
6. Brand Name Support
- Recognize brand names as valid food identifiers
- Handle brand + product patterns (e.g., "quest bar chocolate chip")
- Support fuzzy matching against food database for branded items
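Fuzzy matching for branded items could, for instance, use edit distance. The sketch below uses Levenshtein distance as one option among several; `editDistance`, `bestBrandMatch`, and the `maxDistance` threshold are hypothetical and would need tuning against real transcription errors.

```swift
import Foundation

// Case-insensitive Levenshtein distance between two strings.
func editDistance(_ lhs: String, _ rhs: String) -> Int {
    let s = Array(lhs.lowercased())
    let t = Array(rhs.lowercased())
    if s.isEmpty { return t.count }
    if t.isEmpty { return s.count }
    var previous = Array(0...t.count)
    for i in 1...s.count {
        var current = [i] + Array(repeating: 0, count: t.count)
        for j in 1...t.count {
            let cost = s[i - 1] == t[j - 1] ? 0 : 1
            current[j] = min(previous[j] + 1,        // deletion
                             current[j - 1] + 1,     // insertion
                             previous[j - 1] + cost) // substitution
        }
        previous = current
    }
    return previous[t.count]
}

// Returns the closest name within `maxDistance` edits, or nil if none is
// close enough.
func bestBrandMatch(for query: String, in names: [String], maxDistance: Int = 2) -> String? {
    let scored = names.map { (name: $0, distance: editDistance(query, $0)) }
    let best = scored
        .filter { $0.distance <= maxDistance }
        .min { $0.distance < $1.distance }
    return best?.name
}
```

A linear scan like this is only reasonable for a short candidate list, for example the top results already returned by the existing search.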
7. Complex Compound Phrases
- Parse multi-item phrases with "with" connector (e.g., "coffee with two tablespoons of half and half")
- Handle the special case where "half and half" is a food name, not a fraction
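A sketch of both rules, assuming whitespace-separated words. `splitItems` and `isFractionHalf` are hypothetical helpers, and splitting on every " with " would mis-split foods whose own name contains that word, so a real parser would need a guard for those.

```swift
import Foundation

// Splits a phrase into items on the " with " connector.
func splitItems(_ input: String) -> [String] {
    return input.lowercased().components(separatedBy: " with ")
}

// True when the word at `index` is "half" used as a fraction rather than as
// part of the food name "half and half".
func isFractionHalf(words: [String], at index: Int) -> Bool {
    guard words.indices.contains(index), words[index] == "half" else {
        return false
    }
    // First "half" of "half and half".
    if index + 2 < words.count,
       words[index + 1] == "and",
       words[index + 2] == "half" {
        return false
    }
    // Second "half" of "half and half".
    if index >= 2,
       words[index - 1] == "and",
       words[index - 2] == "half" {
        return false
    }
    return true
}
```

Checking the neighbors of each "half" keeps "one and a half cups" a fraction (the preceding word is "a", not "and") while leaving "half and half" intact as a food name.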
Test Scenarios
The implementation must correctly parse all of the following scenarios:
Breakfast
| Input | Expected Quantity | Expected Unit | Expected Food |
|---|---|---|---|
| "Two scrambled eggs" | 2 | count | scrambled eggs |
| "Two slices of whole wheat toast" | 2 | slices | whole wheat toast |
| "One tablespoon of butter" | 1 | tablespoon | butter |
| "Half a cup of blueberries" | 0.5 | cup | blueberries |
| "Twelve ounces of coffee with two tablespoons of half and half" | 12; 2 | ounces; tablespoons | coffee; half and half (compound) |
Lunch
| Input | Expected Quantity | Expected Unit | Expected Food |
|---|---|---|---|
| "Six ounce grilled chicken breast" | 6 | ounce | grilled chicken breast |
| "Two cups of mixed greens" | 2 | cups | mixed greens |
| "Quarter cup of cherry tomatoes" | 0.25 | cup | cherry tomatoes |
| "Two tablespoons of olive oil and vinegar dressing" | 2 | tablespoons | olive oil and vinegar dressing |
| "One medium apple" | 1 | medium | apple |
| "Sixteen ounces of water" | 16 | ounces | water |
Snack
| Input | Expected Quantity | Expected Unit | Expected Food |
|---|---|---|---|
| "One ounce of almonds" | 1 | ounce | almonds |
| "One medium banana" | 1 | medium | banana |
| "Greek yogurt, five point three ounces" | 5.3 | ounces | Greek yogurt |
Dinner
| Input | Expected Quantity | Expected Unit | Expected Food |
|---|---|---|---|
| "Eight ounces of salmon" | 8 | ounces | salmon |
| "One cup of brown rice" | 1 | cup | brown rice |
| "One and a half cups of steamed broccoli" | 1.5 | cups | steamed broccoli |
| "One teaspoon of sesame oil" | 1 | teaspoon | sesame oil |
| "Twelve ounces of sparkling water" | 12 | ounces | sparkling water |
Quick Logging (Mixed Items)
| Input | Expected Parsing |
|---|---|
| "Peanut butter sandwich with two tablespoons of peanut butter" | Compound: sandwich + 2 tbsp peanut butter |
| "One cup of two percent milk" | 1 cup, food: "2% milk" |
| "Protein bar, quest bar chocolate chip" | 1 count, food: "Quest Bar Chocolate Chip" |
Implementation Notes
- Review existing parsing logic in the NutriKit/Voice/ directory
- Key files likely affected: TextLoggingScreen.swift, VoiceLoggingScreen.swift, ParsedFoodRow.swift, VoiceEditSheets.swift
- Consider using a tokenizer approach that identifies quantity tokens, unit tokens, and food name tokens
- May need to maintain a dictionary of fraction words and their numeric equivalents
- Size descriptors (small/medium/large) should be treated as special unit types
- Consider edge cases where food names contain words that could be misinterpreted as quantities
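The tokenizer idea in these notes might be sketched as follows. `Token`, the word tables, and `tokenize` are hypothetical and not taken from the existing NutriKit code. The sketch assumes quantity-first input, so compound quantities, spoken decimals, and food-first phrases such as "Greek yogurt, five point three ounces" would need additional handling.

```swift
import Foundation

enum Token: Equatable {
    case quantity(Double)
    case unit(String)
    case size(String)   // size descriptors treated as a special unit type
    case word(String)   // part of the food name
}

// Hypothetical word tables; extend as needed.
let quantityWords: [String: Double] = [
    "half": 0.5, "quarter": 0.25, "one": 1, "two": 2, "three": 3,
    "six": 6, "eight": 8, "twelve": 12, "sixteen": 16
]
let unitWords: Set<String> = [
    "cup", "cups", "ounce", "ounces", "tablespoon", "tablespoons",
    "teaspoon", "teaspoons", "slice", "slices"
]
let sizeWords: Set<String> = ["small", "medium", "large"]
let fillerWords: Set<String> = ["of", "a", "an"]

func tokenize(_ input: String) -> [Token] {
    var tokens: [Token] = []
    var unitSeen = false
    var inFoodName = false
    for word in input.lowercased().split(separator: " ").map(String.init) {
        if inFoodName {
            // Food-name words are kept literally, even number words.
            tokens.append(.word(word))
        } else if unitSeen {
            // After the unit, only a connecting "of" is skipped; anything
            // else starts the food name ("two percent milk").
            if word == "of" { continue }
            inFoodName = true
            tokens.append(.word(word))
        } else if let value = quantityWords[word] ?? Double(word) {
            tokens.append(.quantity(value))
        } else if unitWords.contains(word) {
            tokens.append(.unit(word))
            unitSeen = true
        } else if sizeWords.contains(word) {
            tokens.append(.size(word))
            unitSeen = true
        } else if fillerWords.contains(word) {
            continue
        } else {
            inFoodName = true
            tokens.append(.word(word))
        }
    }
    return tokens
}
```

Treating everything after the unit as food name is what keeps the "two" in "one cup of two percent milk" from being read as a second quantity, which addresses the misinterpretation edge case in the last note.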
Combined Acceptance Criteria
Food Matching Fix:
- "1 tablespoon of butter" correctly matches a butter food entry (not butterfish)
- "1 cup of blueberries" correctly matches "Blueberries, Raw" or equivalent
- Singular/plural variations of food names match appropriately
- Exact or near-exact text matches prioritized over purely semantic matches
- Add unit tests for common food matching scenarios
Parsing Enhancements:
- All test scenarios in the tables above parse correctly
- Existing voice logging functionality remains intact (no regressions)
- Parser gracefully handles ambiguous inputs with reasonable defaults
- Unit tests cover all new parsing patterns
- Performance remains acceptable (parsing should be near-instantaneous)
Integration:
- Both fixes work seamlessly together
- Users can now reliably voice log with natural language patterns
- Food matching is accurate for common foods
Out of Scope
- Multi-language parsing (English only for this iteration)
- Learning user-specific food preferences (parsing enhancements only)
Build instruction: Use -destination 'platform=iOS Simulator,name=iPhone 17 Pro' when building this project
Comments (1)
Additional Cases Reported
Two more examples of incorrect food matching:
| Input | Amount Parsed | Matched Food | Expected |
|---|---|---|---|
| "2 cups of steamed broccoli" | 2 cups (correct) | "O's Müsli, OREO" (incorrect) | Broccoli |
| "1 tablespoon of sesame oil" | 1 tablespoon (correct) | "O's Müsli, OREO" (incorrect) | Sesame oil |
Note: The amount parsing is working correctly in both cases, but both queries are matching to the same completely unrelated food ("O's Müsli, OREO"). This suggests the search/embedding matching has a more fundamental issue than just preferring similar-sounding foods.