Pronoun Drama: When Your AI Learns Sundanese Without Permission
Unbelievable. After a whole day of banging my head against the keyboard (figuratively, of course; doing it for real would fry some neurons), my pronoun error detection system finally works! But... there was a bitterly hilarious surprise that made me want to go dangle from the LAN cable. See the screenshot? The word 'You' suddenly turned into 'Maneh'. GOOD GRIEF. Sure, 'Maneh' does mean 'You', but the problem is that the target is formal Indonesian, not polite Sundanese, let alone the rough kind! Suddenly my pipeline has the personality of a street-corner kid from Bandung.
The problem traces back to the dataset I used to train this detection model. It turns out a lot of entities from West Java data slipped in without strict filtering. As a result, the AI decided 'Maneh' was just a more efficient variant of 'Kamu', since the character counts are similar. The biggest challenge in NLP really is Coreference Resolution: how does a machine know 'who did what to whom' across a long text without scrambling the language's politeness levels?
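To make the 'who did what to whom' problem concrete, here is a toy recency-based resolver. This is purely illustrative and nothing like my actual model: it just links every pronoun to the nearest earlier entity mention, which is exactly the kind of shortcut that breaks down once the antecedent is paragraphs away.

```python
# Toy coreference heuristic: link each pronoun to the most recent
# preceding named entity. Real coreference models use learned features;
# this sketch just shows why long pronoun-antecedent gaps are hard.
PRONOUNS = {"he", "she", "it", "they", "you"}

def resolve_pronouns(tokens, entities):
    """Map each pronoun's token index to the nearest earlier entity mention."""
    links = {}
    last_entity = None
    for i, tok in enumerate(tokens):
        if tok in entities:
            last_entity = tok
        elif tok.lower() in PRONOUNS and last_entity is not None:
            links[i] = last_entity
    return links

tokens = "Arthur drew his sword . He faced the dragon . It roared".split()
print(resolve_pronouns(tokens, {"Arthur", "dragon"}))
# → {5: 'Arthur', 10: 'dragon'}
```

The moment two candidate antecedents appear between a pronoun and its true referent, a recency rule like this picks the wrong one, which is why real systems need much richer context.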
Why Is Pronoun Detection So Hard?
- Ambiguous context: pronouns often refer back to a subject buried in an earlier paragraph. Without strong context memory, the machine is basically playing a guessing game.
- Regional-language bias: because my dataset is such a messy mix (wild crawls from social media), the AI sometimes 'hallucinates' in whichever regional dialect shows up most frequently.
- Calibration problems: a 100% score from the AI doesn't always mean 'perfect'; it can also mean 'your AI is too lazy to think' and is agreeing with whatever comes out.
- Honorific hierarchy: Indonesian has 'Kamu, Anda, Engkau', and the AI sometimes can't tell which one to use for a king and which for a delivery courier.
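For the curious, the register check my pipeline should have had can be sketched in a few lines. Everything below is a made-up sample for illustration (the lexicons are tiny and the function name is hypothetical), not my real code:

```python
# Sketch of a post-translation register check: scan the output for pronouns
# that don't belong to the target register. The lexicons here are tiny
# illustrative samples, not a full dictionary.
REGISTER_LEXICON = {
    "formal_id": {"anda", "saudara"},
    "informal_id": {"kamu", "lu", "lo", "engkau"},
    "sundanese": {"maneh", "anjeun", "sia"},
}

def flag_register_violations(text, target="formal_id"):
    """Return (word, register) pairs for pronouns outside the target register."""
    hits = []
    for word in text.lower().split():
        for register, lexicon in REGISTER_LEXICON.items():
            if register != target and word in lexicon:
                hits.append((word, register))
    return hits

print(flag_register_violations("Maneh mau ke mana , Baginda ?"))
# → [('maneh', 'sundanese')]
```

A check this dumb would already have caught my bug: one pass over the output, and anything tagged 'sundanese' in a formal Indonesian script gets flagged for review.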
Imagine an epic RPG on the scale of Baldur's Gate, thousands of words long, where the AI suddenly gets confused about dia/ia/kamu. One slip and the world-destroying Demon King gets called 'Lu' (street slang for 'you') while the beggar on the roadside gets addressed as 'Your Majesty'. Ironically, even though I used AI to compute the calibration score, the scoring system was still broken. How does everything get 100% when the output still smells that amateurish?
Today's lesson: technology is fun, but human calibration is mandatory. You don't want to be deep in a serious game about a holy war when an enemy knight suddenly says: 'Maneh teh bade ka mana? Kadieu atuh!' (roughly: 'Where are you off to? Come over here!'). Hahaha! I'll fix the values again so it better knows the line between formal state language and Priangan-region slang. Wish me luck with the research, so there's no more 'Maneh' between us (unless we happen to be in Dago)!
The Sundanese Glitch: AI and the Great Pronoun Calamity
After a full day of literal (well, mostly metaphorical) head-banging against my workstation, my pronoun error detection system finally came alive! But as is often the case with AI development, it decided to give me a cheeky, unscripted surprise. In the latest regression test, the model translated 'You' consistently to 'Maneh'. For those not acquainted with regional Indonesian linguistics, 'Maneh' is a specific Sundanese pronoun for 'You'—often used among peers in a very informal setting. It’s perfect for a Bandung coffee shop, but it is devastating for a formal epic RPG script!
The issue stems from Coreference Resolution, one of the most stubborn hurdles in Natural Language Processing (NLP). This task involves linking pronouns ('he', 'she', 'it', 'they') back to the original entities correctly over massive spans of text. If the machine loses track of who is talking to whom, the immersion is instantly shattered. Imagine a legendary dragon referring to a brave knight as 'dude' or 'bestie'—that is essentially what my model was doing by injecting regional slang into a high-fantasy context.
Why Pronoun Detection Fails in Multilingual Models:
- Referential Ambiguity: Pronouns often point to antecedents from three or four sentences prior. Models with limited context windows struggle to keep track of links that span that far.
- Training Data Contamination: My massive web-crawled dataset had a hidden bias toward West Javanese social media posts, leading the AI to think 'Maneh' was a standard universal variant.
- The Confidence Loop: The system reported a 100% confidence score. In AI speak, that doesn't mean it's right; it often means the model is badly miscalibrated and has simply stopped questioning its own mistakes.
- Formality Grading: Indonesia's regional languages have deep politeness hierarchies (Javanese Krama/Ngoko, Sundanese Lemes/Loma). Teaching an AI to know when to be respectful and when to be blunt is like teaching a toddler diplomacy.
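That 100%-confidence problem is actually measurable. Assuming you log a (confidence, was-it-correct) pair for each prediction, a minimal expected calibration error (ECE) check looks something like this. This is a sketch, not my production scorer:

```python
# Minimal expected-calibration-error check, assuming (confidence, correct)
# pairs are logged per prediction. A model that says "100%" on everything
# while getting only 75% right has a calibration gap of 0.25.
def expected_calibration_error(preds, n_bins=10):
    """preds: list of (confidence in [0, 1], correct as bool)."""
    bins = [[] for _ in range(n_bins)]
    for conf, correct in preds:
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, correct))
    ece, total = 0.0, len(preds)
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(ok for _, ok in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece

# An always-100%-confident model that is right only 3 times out of 4:
preds = [(1.0, True), (1.0, True), (1.0, True), (1.0, False)]
print(expected_calibration_error(preds))  # → 0.25
```

If this number creeps up while the confidence scores all sit at 1.0, the model is bluffing, and that is exactly what my scorer failed to notice.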
The screenshot I shared proves that even with millions of parameters, an AI can still sound like a lost tourist in West Java. My quality assurance pipeline initially didn't flag it because, technically, 'Maneh' exists in the dictionary of the broader Indonesian archipelago. This is why human-in-the-loop (HITL) calibration is non-negotiable in professional modding and localization. You can't just 'set it and forget it' unless you want your noble characters sounding like they're about to start a rock band in suburban Bandung.
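A HITL gate doesn't have to be fancy. Here is a hypothetical routing sketch (the pronoun set, thresholds, and labels are all illustrative, not from my pipeline): escalate anything with regional leakage, and spot-check anything the model is suspiciously certain about.

```python
# Hypothetical human-in-the-loop routing: regional leakage always goes to a
# human; "perfect" confidence gets sampled for spot checks; the rest passes.
SUNDANESE_PRONOUNS = {"maneh", "anjeun", "sia"}

def route_for_review(line, confidence, ceiling=0.99):
    """Decide where a translated line goes in the QA pipeline."""
    words = set(line.lower().split())
    if words & SUNDANESE_PRONOUNS:
        return "human_review"   # regional leakage: always escalate
    if confidence >= ceiling:
        return "spot_check"     # suspiciously perfect scores get sampled
    return "auto_approve"

print(route_for_review("Maneh teh bade ka mana ?", 1.0))    # → human_review
print(route_for_review("Ke mana Anda akan pergi ?", 0.92))  # → auto_approve
```

The design point is that the two failure modes from this post (dialect leakage and fake 100% confidence) get different treatments: one is a hard stop, the other just raises the sampling rate for human reviewers.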
I’ll be back at the drawing board tonight to tighten the weight of my loss functions and ensure regional leakage is minimized. Technology can get you 90% of the way to a perfect translation, but that final 10%—the human touch of knowing culture over raw data—is where the real magic happens. If you see 'Maneh' in your RPG character's mouth next week, please just tell me I'm fired! Cheers to the chaos of research!
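The loss-weight tightening I have in mind can be sketched as a leakage penalty stacked on top of the usual negative log-likelihood. Everything here (`REGIONAL_TOKENS`, the `lam` multiplier, the toy distribution) is an illustrative assumption, not my actual training loss:

```python
import math

# Sketch of a leakage-penalized loss: the usual negative log-likelihood of
# the target token, plus a penalty proportional to the probability mass the
# model puts on blocklisted regional tokens. `lam` is an arbitrary knob.
REGIONAL_TOKENS = {"maneh", "sia"}

def penalized_loss(dist, target, lam=2.0):
    """dist: dict token -> probability; target: the gold token."""
    nll = -math.log(dist[target])
    leakage = sum(dist.get(tok, 0.0) for tok in REGIONAL_TOKENS)
    return nll + lam * leakage

# The model is right ('anda') but still hedges 30% toward 'maneh',
# so the leakage term keeps pushing that mass down during training:
dist = {"anda": 0.6, "maneh": 0.3, "kamu": 0.1}
print(penalized_loss(dist, "anda"))
```

The nice property is that the penalty fires even when the top-1 prediction is correct, so the model gets discouraged from keeping 'Maneh' as a plausible runner-up instead of only being punished when it actually emits it.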