Dictionaria -

Glottocode: wanu1241
ISO 639-3: wnk

Wanukaka dictionary

by Leif Asplund and Tomas Tagu Dodu and Allahverdi Verdizade

Introduction

The Wanukaka language

Wanukaka (wnk ISO 639, Glottocode wanu1241) is a language mainly spoken in the district Wanokaka¹, in the regency of West Sumba, in the southwest of the eastern Indonesian island of Sumba, mainly “in a large number of small settlements in and around the valley of the Wanokaka River” (Gunawan 2010:1). The present district Wanokaka contains the two traditional domains of Wanokaka and Rua, in both of which Wanukaka is spoken. To the east Laboya is spoken, to the north the Lolina dialect of Weyewa and to the west, Anakalang. All these languages belong to the Sumbanese group of Austronesian languages. The other Sumbanese languages are Kambera (or East Sumbanese), Mamboru, Baliledu-Ponduk, Garo, and Kodhi.

In Ethnologue (Eberhard et al. 2025), the number of speakers of Wanukaka is given as 10,000². According to ELP (Endangered Languages Project), the number of speakers is 14,000³. According to BPS (Badan Pusat Statistik) (2024:20), the population of the district Wanokaka is 20,635, and the great majority of them supposedly speak Wanukaka. In the district Waikabubak, there are also speakers of a variety of Wanukaka. As far as is known, no sociolinguistic investigation of Wanokaka has been made, for example of how many residents speak other regional languages or only Indonesian, or of the extent to which children know the language. Even so, the number of speakers can perhaps be roughly estimated at 15,000-20,000⁴.

The dialects of Wanukaka are given as Wanukaka and Rua in Ethnologue and as Nuclear Wanukaka and Rua in Glottolog (Hammarström et al. 2025). Rua was formerly a distinct domain, not included in Wanokaka, but to what extent Wanukaka as spoken in Rua differs from other varieties is not known. A difference between Wanukaka spoken in the upper and lower Wanokaka valley has also been reported (Gunawan 2000:279). The language as spoken in the district Waikabubak is quite different, because of heavy influence from the Lolina dialect of Weyewa.

The status of the language is, according to Ethnologue, 6a* (vigorous), but in ELP the transmission level is given as Level 3 (“Some adults in the community are speakers, but the language is not spoken by children”) and the language is regarded as endangered (“60 percent certain, based on the evidence available”)⁵. This is the highest endangerment level for a Sumbanese language, equalled only by Mamboru. As for all Sumbanese languages, there are two levels of the language, one which is used in day-to-day communication, and one which is used in ritual and formal contexts. The second level includes fixed sayings in parallel language, the interpretation of which is often not obvious, especially for young people. This aspect of the language has been treated in Mitchell (1988) and more examples of parallel speech are given in Mitchell (2008b).

The language most closely related to Wanukaka is clearly Anakalang. This is also the conclusion of Budasi (2007) and Putra (2007)⁶. Budasi (2010) has further tried to show that Anakalang and Wanukaka are different languages, based on phonetic and lexical evidence, but most of his evidence is not valid, for various reasons. Mitchell (1988:73) says, “Eastern Sumbanese is spoken over the eastern three-quarters of the island, with the Wanukaka dialect as one of the variations occurring in the most westerly parts of the east Sumba language area”⁷. The level of mutual intelligibility between Wanukaka and Anakalang remains to be tested. According to Asplund (2010), Wanukaka belongs to the Central-Eastern Sumbanese languages, which also include Anakalang, Baliledu-Ponduk, Mamboru and Kambera (East Sumbanese). The geographically closest language to the east is Laboya, but the influence between the two languages seems to have been slight, except perhaps in the border regions in recent times. Because of traditional enmity between Wanokaka and Lamboya, the border area between the two domains was formerly uninhabited. In the 1970s, there was even a war between them, in which Lamboya conquered a part of the border area which traditionally belonged to Wanokaka. On the other hand, there have been trade and marriage connections with the southern coast of eastern Sumba, especially with Tidahu, but also with Tarimbang and Salura island (Gunawan 2000:27).

The origins of the population according to traditional history

Before the Dutch occupation of Sumba in the early 20th century, written information about Wanokaka is extremely scarce. The main source is the clan (kaɓihu) myth of origin histories (kanuga), which have been investigated by Gunawan (2000:280-285). Gunawan distinguishes three main groups of clans according to the road by which they came to Wanokaka. One group of clans came overland through Anakalang and two of these clans are said to have become the original land-owners in the Wanokaka valley. Another group came to Cape Sasar, on the north coast of eastern Sumba, and travelled along the east and south coasts of Sumba until they reached Wanokaka. However, one clan of this group continued first to Gaura and Kodhi before they returned to Wanokaka, and another clan stayed for a long time in Tidahu, in eastern Sumba, before they continued to Wanokaka. Another group is said to have come from heaven to the highest mountain in Sumba, Yawila, in an area where the Weyewa language is spoken today, and then some of them moved to Mount Bodohula and Hodana in the Laboya area, and from there to the Wanokaka valley. The Wanukaka-speaking clans in Upper Loli (Loli Deta, Indonesian: Loli Atas) which come from Mount Lapale are also said to come from Yawila, but not via Hodana. One clan is said to have come from Tarimbang and Salura Island in eastern Sumba, where they intermarried with Sawunese people. Especially for the clans which are said to have come from Yawila, there is some archaeological evidence in large abandoned villages, mainly marked by tombstones, on the three mountains Mamodu Hill, Labere and Lapale, the first two in Wanokaka near the borders of Lamboya and Lapale in the Loli district. Other abandoned villages which seem to have a connection with Wanokakan history are Weilawa and Prai Lutang, both in present-day Anakalang.

Background of the present dictionary

The author of the dictionary, Bapak Tomas Taga Dodo (‘Pa Tom’), is a retired university teacher and a native speaker of Wanukaka. The two redactors, Allahverdi Verdizade and Leif Asplund, have no practical knowledge of the language, but have some experience with other Sumbanese languages: Verdizade from writing his master's thesis on Laboya, and Asplund from some mostly unpublished and unfinished work, mainly lexicographical and comparative, on Baliledu, Laboya and Garo. There is already a dictionary of Wanukaka, namely Mitchell (2008a), but it contains fewer words than the present dictionary and is unreliable in some respects, for example in distinguishing implosive and plosive sounds. In fact, Pa Tom started this dictionary by writing down all the words he could think of which contained an implosive [ɓ], [ɗ] or [ʄ]. Quite a few words are found in Mitchell (2008a) but were not included in the original manuscript for the present dictionary. If there seemed to be a good chance that these words were authentic Wanukaka words, they were included in the present dictionary with a citation. Some words from a 600-word list collected by Asplund were also included, as well as some terms found in Gunawan (2000). Our hope is that, by bringing the dictionary into a presentable shape and translating it into English, we will make it of great use for Sumbanese lexicography and give an impetus to further study of Wanukaka.

The information given about the entries

Information about the forms and selection of the lemmas is given in the next section. The lemmas are given in alphabetical order, with <ng> and <ny> included under <n>, but <ɓ>, <ɗ> and <ĵ> after <b>, <d> and <j> respectively. The maximal information given about a lemma is: 1. Classification of the lemma in a word class, as an affix or as an idiom, 2. Pronunciation, 3. Variants, 4. Meaning in English and Indonesian, sometimes including ethnographical information or information on the uses of objects, 5. Examples in Wanukaka with English and Indonesian translations, 6. Relationships to other words, 7. References to other words, 8. References to sources other than the manuscript written by the author. Ideally, information about which words are used only in a ritual context should be given, but because the manuscript provided no systematic information about this, it is found here only sporadically. However, in many cases the reader can draw some conclusions about this from the examples given.
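The ordering rule described above can be sketched as a sort key. This is only an illustration, not the procedure actually used to build the dictionary: the function name `lemma_sort_key`, the position of <q> in the sequence and the treatment of accented vowels as equal to plain vowels are assumptions made for the example.

```python
import unicodedata

# Assumed collation sequence: the Latin alphabet with <ɓ>, <ɗ> and <ĵ>
# placed directly after <b>, <d> and <j>. The position of <q> is an assumption.
ALPHABET = unicodedata.normalize("NFC", "abɓcdɗefghijĵklmnopqrstuvwxyz")
RANK = {letter: index for index, letter in enumerate(ALPHABET)}

# Assumption: vowels with an acute accent sort together with plain vowels.
FOLD_ACCENTS = str.maketrans("áéíóú", "aeiou")


def lemma_sort_key(lemma):
    """Return a list of letter ranks usable as a key for sorted().

    <ng> and <ny> need no special handling: compared letter by letter,
    they fall under <n>, as the text describes. Characters outside the
    alphabet (spaces, hyphens, parentheses) are ignored.
    """
    normalized = unicodedata.normalize("NFC", lemma.lower())
    normalized = normalized.translate(FOLD_ACCENTS)
    return [RANK[ch] for ch in normalized if ch in RANK]


words = ["ɓina", "buatu", "ĵagal", "nyouga", "juaru", "ngahu", "ɗaga", "namu"]
print(sorted(words, key=lemma_sort_key))
```

With this key, `buatu` sorts before `ɓina`, and `ngahu` falls between `namu` and `nyouga`.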

There is no syntactic information which could decide which parts of speech should be assumed and which word belongs to which word class. This means that the assignment of word classes has to be conventional and based only on meaning, and is in some cases quite arbitrary. However, it is hoped that the assignment of word classes will be of some use in searching for words of certain types. The open word classes used in the dictionary are nouns (n.), verbs (v.) and adverbs (adv.). It should be noted that adjectives are not included. Klamer (1998:115-118) has argued that adjectives do not constitute a separate word class in the closely related Kambera, and that could also be the case in Wanukaka. An additional reason for not including adjectives is that in many cases it proved hard to decide from the Indonesian translation whether a word should be translated into English with a ‘verbal’, ‘adjectival’ or ‘participial’ meaning. All these words are classified as verbs in this dictionary. If a word is given with an English ‘adjectival’ meaning in the dictionary, like ‘red’, it has to be understood that a meaning like ‘to be red’ should also be assumed. It should also be noted that the verb class has no subdivisions, like intransitive, transitive, stative and existential verbs. The closed word classes include articles (art.), conjunctions (conj.), interjections (interj.), prepositions (prep.), numerals (num.), quantifiers (quant.), numeral classifiers (cls.), particles (part.), personal pronouns (pron.) and demonstratives (dem.). Productive prefixes and suffixes are classified as affixes. The words in the dictionary belonging to personal pronouns, demonstratives, numerals and numeral classifiers are listed at the end of this preface.

Abbreviation	Part of speech
art	article
adv	adverb
cls	numeral classifier
conj	conjunction
dem	demonstrative
interj	interjection
prep	preposition
num	numeral
quant	quantifier
part	particle
pron	personal pronoun

Pronunciations which are not obvious from the written lemma are given in IPA when they could be found in the Wanukaka recordings in Paradisec or in the recording of 600 Wanukaka words made by Asplund.

Variants of the lemmas are given if the variant is considered to be the same word as the word given in the lemma, with only a difference in spelling or a difference in pronunciation, dialectal or not. Variants differing from the lemma only in single versus double consonants, presence versus absence of vowel breaking, or the alternation between intervocalic <ɓ> and <q> ([ʔ]) are given only under the lemma. Other variants are given in their alphabetical position, with a reference to the main lemma.

The Indonesian meaning is generally given in the form it had in the original manuscript of the dictionary. In some cases, when the translation or explanation of a lemma seemed unnecessarily long, it has been shortened. In several cases we have tried to make the English translation or explanation more succinct than the Indonesian text, without loss of information. Sometimes the explanation contains more than just the meaning of the lemma, and can include ethnographical information and the use or significance of the objects designated by the lemma. In the case of lemmas consisting of two or more parts, the ‘literal’ meaning of the expression is given in addition to the ‘real’, or idiomatic, meaning of the expression. Examples are given to illustrate the meanings of the word in the lemma, and are translated into English and Indonesian. Idiomatic compounds should be given their own lemmas, but other collocations of two words (i.e. word combinations where the meaning of the whole expression is derivable from the sum of its parts) should be given as examples. However, this distinction was sometimes hard to uphold, and some inconsistencies can be expected here.

Parallel expressions should ideally constitute headwords, because they are idiomatic; however, because they are often very long, parallelisms as a category have been demoted to examples in the present dictionary. If we know both the figurative and the literal meaning of the expression, both are provided. Compare the following expression under the headword juaru:

Example of an idiomatic example sentence
Form	pari juaru moru – winu bata ɗeata
Figurative meaning	to die young
Figurative meaning (Indonesian)	orang yang meninggal pada usia relatif muda
Literal meaning	rice fell green – betelnut broke off upwards
Literal meaning (Indonesian)	padi jatuh hijau – pinang patah ke atas

The shape of some examples suggests that they are parallel expressions with a figurative meaning, but when the figurative meaning is not known, the example has only the literal translation:

Example of an idiomatic example sentence with only the literal translation
Form	abi mu ɓeara ɗiya na watu na patogul - abi mu pata ɗiya na panuangu ɗeatang
Literal meaning	do not break the footrest stone - do not break the ladder for climbing
Literal meaning (Indonesian)	janganlah kamu memecahkan batu untuk tumpuan kaki - janganlah kamu mematahkan tangga untuk naik (di rumah kita)

The relationships to other words build on the lexical function model described in Coward & Grimes (2000:121-136). Lexical functions that organize the lexicon in ontologies are supposed to reflect the ideas of the speakers about the world; thus, for example, taɗanu 'whale' has a 'kind of' (hyponym) relation to kaboku 'fish'. Not all available lexical functions are used in the present dictionary. For instance, the relation between ɗaga(ng) 'to sell' and hí 'to buy' is not, strictly speaking, that of counterparts, but rather of inversives; however, we have not used this function, subsuming it under Cpart instead.

Lexical function	Commentary	Example
Ant	Antonym; reserved for words corresponding to adjectives	belar 'wide' ⇆ hadanga 'narrow'
Caus	Causative derivation	mati 'to die' → hamati 'to kill'
Compound	(Mostly) nominal compounds	hiɗu 'ill' → hiɗu ati 'sad; resentful'
Cpart	Counterpart, co-hyponym, coordinate term; sharing the same hypernym	arawei 'wife' ⇆ lá 'husband', ɗaga(ng) 'to sell' ⇆ hí 'to buy'
Derivation	Other word-forming derivations	weha 'to open' → taweha '(to be) open'
Gen	Hypernym	taɗanu 'whale' → kaboku 'fish'
Idiom	Other idiomatic expressions	deki 'to perch' → penewi deki ɗeata 'to lie'
Ninst	Instrumental noun	keɗir 'to spin' → kija 'cotton spinning tool'
Nug	Undergoer noun	maruhi 'to harvest (rice panicles)' → pari 'unhusked rice'
ParS	The other part of a parallel synonymous pair, usually used together	nuku 'custom' ⇆ hara 'tradition'
Part	Partonym, meronym, part of the term	uma 'house' → ɓina 'door'
Phase	Phase of the development of a term (living thing or plant)	ahu 'dog' ⇆ ana kuku 'puppy'
Sim	Similar, near-synonyms	ɓakul 'big' ⇆ karuangu 'huge'
Sound	The sound produced (mostly used for animals)	manu 'fowl, chicken' → kaɗeaku 'to cackle'
Spec	Hyponym	kaboku 'fish' → taɗanu 'whale'
Syn	Synonym	uhi 'similar' ⇆ tudu 'similar'
SynL	A borrowed term for which there exists a native equivalent	adi 'younger sibling (mostly as address)' → naka 'younger sibling'
Whole	Holonym, what the term is a part of	ɓina 'door' → uma 'house'
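The table suggests that some lexical functions come in mirrored pairs (Gen and Spec, Part and Whole) and that others, shown with ⇆, hold in both directions. The following is a minimal sketch of how such relations could be stored and completed with their back-links. Whether the dictionary data are actually built this way is not stated in the text; the function name `with_back_links` and the triple representation are assumptions made for the illustration.

```python
# Mirrored pairs, as suggested by the Gen/Spec and Part/Whole examples.
INVERSE = {"Gen": "Spec", "Spec": "Gen", "Part": "Whole", "Whole": "Part"}

# Functions shown with a two-way arrow in the table.
SYMMETRIC = {"Ant", "Cpart", "ParS", "Phase", "Sim", "Syn"}


def with_back_links(relations):
    """Return the set of (source, function, target) triples, extended
    with the reverse triple for every mirrored or symmetric function.
    Directed functions such as Caus are left as they are."""
    relations = list(relations)
    closed = set(relations)
    for source, function, target in relations:
        if function in SYMMETRIC:
            closed.add((target, function, source))
        elif function in INVERSE:
            closed.add((target, INVERSE[function], source))
    return closed


links = with_back_links([
    ("taɗanu", "Gen", "kaboku"),   # 'whale' is a kind of 'fish'
    ("uma", "Part", "ɓina"),       # 'house' has the part 'door'
    ("uhi", "Syn", "tudu"),        # both 'similar'
    ("mati", "Caus", "hamati"),    # 'to die' -> 'to kill'
])
print(("kaboku", "Spec", "taɗanu") in links)
```

Under these assumptions, entering taɗanu → kaboku as Gen also yields kaboku → taɗanu as Spec, which matches the two example rows in the table.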

References to Mitchell (2008a), Gunawan (2000) and the unpublished word list recorded by Asplund are given for words which are taken from these sources but not found in the manuscript for the present dictionary.

Forms of the lexical entries and orthography

The main principle was that the orthography should correspond with that used for Indonesian as far as possible. This means that <j> is used for [ɟ], <ny> for [ɲ], <ng> for [ŋ] and <y> for [j]. Other considerations were that the orthography should be phonemic and should correspond with the practice of Pa Tom in writing the dictionary and with the present practice of writing Wanukaka. In two cases, our orthography deviates from this practice, i.e. in writing <ĵ> for [ʄ] and <q> for [ʔ]. The usual symbol for writing the glottal stop in Sumba is <’>. However, the same symbol is often used for many other purposes, e.g. to indicate an implosive sound, to show that something is omitted and to mark stress. One of the reasons for using <ĵ> is that the fonts have no capital form of <ʄ>. However, for the implosives [ɓ] and [ɗ], the IPA symbols are used here, as they are in the Wanukaka Bible translation (Alkitab Wanukaka n.d.). An example of following the phonemic principle is that the paragogic vowel [u] is in principle not written, because it can always be assumed when a word ends in a consonant, and is consequently not phonemic. In examples, especially in examples of ritual speech, it was written in the original manuscript, and it is kept in those cases. It seems that some paragogic vowels were also written in the entries in the manuscript for the dictionary; they have been retained if there was uncertainty as to whether they are paragogic vowels. In words quoted from Mitchell (2008a), the paragogic vowel is deleted, if found. A potential problem here is that final non-paragogic vowels can also sometimes be omitted in pronunciation. Such a vowel is nevertheless always written in the entries, and if it has been noted from recordings that a final vowel can be unpronounced, this is indicated in a phonetic transcription.
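The grapheme-to-sound correspondences named above can be sketched as a small conversion routine. It covers only the consonant spellings mentioned in this section and leaves everything else unchanged, so it yields a rough broad transcription at best. The function name `to_broad_ipa` is hypothetical, and the paragogic vowel, vowel length and stress are not handled.

```python
import unicodedata

# Orthographic symbols whose IPA value differs from the letter itself,
# as listed in this section. All other letters are copied unchanged.
GRAPHEME_TO_IPA = {
    "ng": "ŋ",
    "ny": "ɲ",
    "j": "ɟ",
    "ĵ": "ʄ",
    "q": "ʔ",
    "y": "j",
}


def to_broad_ipa(word):
    """Convert an orthographic form to a rough IPA string, trying the
    digraphs <ng> and <ny> before single letters."""
    word = unicodedata.normalize("NFC", word)
    output = []
    position = 0
    while position < len(word):
        digraph = word[position:position + 2]
        if digraph in GRAPHEME_TO_IPA:
            output.append(GRAPHEME_TO_IPA[digraph])
            position += 2
        else:
            letter = word[position]
            output.append(GRAPHEME_TO_IPA.get(letter, letter))
            position += 1
    return "".join(output)


print(to_broad_ipa("ĵagal"), to_broad_ipa("wulang"), to_broad_ipa("maqal"))
```

Checking digraphs first matters: otherwise the <y> of <ny> would be converted on its own and `nyouga` would come out with [nj] rather than [ɲ].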

The problem with a phonemic orthography is that a phonemic analysis of the language has not been made. We are not certain of the phonemic status of the glottal stop, but it is quite likely that it is written in places where it is pronounced, even if it is not phonemic in all cases, for example between vowels. Even there, it seems optional in some vowel combinations. The glottal stop [ʔ] varies with the implosive labial [ɓ] between vowels (cf. maɓal/maqal 'to be silent, calm, to remain in place'). Even though the variants with a glottal stop seem to be more common, they are regarded as optional variants of the forms with the implosive labial. The glottal stop is written where it is found in the manuscript for the dictionary, except at morpheme boundaries, for example between the valency-increasing prefix pa- and a word beginning with a vowel (e.g. paana ‘to give birth’, from ana ‘child’). Words with a final velar nasal [ŋ] often have variants without it. In many such cases, the preceding vowel is nasalised but the following velar nasal is barely heard; this is especially salient in the final sequence /aŋ/, cf. omang [om(ː)ɑ̃(ŋ)] 'forest'. It seems possible that these words should be considered as having a final velar nasal phoneme, but that this phoneme is sometimes realized only as nasalisation on the preceding vowel. These words are given with parentheses around the final velar nasal in the dictionary. A complication here is that a major function of the suffix -ng in Kambera (Klamer 1998:197-209) is to derive applicative verbs. It can thus be suspected that, in entries where -ng is marked as optional, the form with -ng in fact has an applicative meaning.
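For searching, it can be useful to expand an entry written with a parenthesised optional segment, such as ɗaga(ng), into its two spellings. The following is a minimal sketch under the assumption that parentheses in a headword always mark optional material; the function name `expand_optional` is hypothetical and not part of the dictionary's own tooling.

```python
import re


def expand_optional(form):
    """Return the spellings covered by a form with parenthesised optional
    material: first the short form, then the full form. A form without
    parentheses is returned unchanged as a one-item list."""
    if "(" not in form:
        return [form]
    short_form = re.sub(r"\([^)]*\)", "", form)
    full_form = form.replace("(", "").replace(")", "")
    return [short_form, full_form]


print(expand_optional("ɗaga(ng)"))
```

If a form contained several parenthesised segments, this sketch would only return the variant with all of them dropped and the variant with all of them kept.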

The main problem for a phonemic orthography is that of long and short vowels in relation to long and short consonants. The usual way of dealing with this is to write double consonants after short vowels and single consonants after long vowels. However, in the absence of systematic measurements of consonant and vowel lengths, it is not possible to say whether vowel length (or quality) or consonant length is phonemic. An impression is that for many words neither vowel nor consonant length is phonemically relevant, and that various pronunciations are possible in this respect. Judging from the orthography in the manuscript for the present dictionary, length could make the difference in some minimal pairs, like inu ‘pith’ / innu ‘to drink’, but the evidence for contrastive vowel or consonant length must be regarded as quite weak. Generally, the manuscript has double consonants quite rarely, but if a word is given with a double consonant, that form is given as an entry. If a word is spelled in both ways, the form with a single consonant is given as the entry and that with a double consonant as a variant. If an entry in the manuscript differs from the corresponding entry in Mitchell (2008a) in this respect, the form in Mitchell (2008a) is given as a variant. It may be noted that <q>, <j>, <h>, <ny>, <ng>, <y> and <w> are never written double in the manuscript for the present dictionary or in Mitchell (2008a), although the word kabangnga ‘mute’ is found in the Wanukaka translation of Luke 1:20. In addition, double <ĵ> is found only in mariĵĵi ‘to fasten with rope’ and meĵĵal ‘soft’ in the manuscript for the present dictionary, but both words are also written with single <ĵ> there.

Final stressed vowels in content words are always long, and they have been marked with an acute accent. In fact, the accent is not needed on content words consisting of one syllable, but it is useful to distinguish some content words from clitic function words (cf. hí ‘to buy; price’ vs. hi ‘proper article’), and also to indicate where the stress is in two-syllable words with final stress (makí ‘ashamed’). In the Bible translation into Wanukaka, double vowels are used instead in these cases. That also works fine.

How vowel sequences are to be analysed has to be left to future investigations. However, there are some things that should be noted: 1. Mitchell (n.d.) says that there are seven diphthongs, but he does not mention which they are. 2. There is a possible case of phonemic underdifferentiation in some vowel combinations. 3. Some vowel combinations are the results of vowel breaking.

In the dictionary, the following vowel combinations are not found: <ae>, <ao>, <ie>, <io>, <ue>, <uo>, <eo> and <oe>⁸. Similarly, some other vowel combinations are very rare: <iu> (1-3 examples), <ui> (one example in Mitchell 2008a) and <eu> (1 example, only in ritual language). The results of vowel breaking are <ea>, <oa>, <oi> and <ia>, but these vowel combinations seem to be found also when there is no vowel breaking, at least at the end of words (kapea ‘star’, kaloa ‘banana’, kaboi ‘preserved or fermented material’, kamia ‘anus’). That they are found non-word-finally is very doubtful for <ea>, <oa> and <oi>⁹, but quite possible for <ia>. The other vowel combinations are: <ai> (hai ‘comb’, haila ‘to load’), <au> (tau ‘to put’, paulu ‘k.o. shrimp’), <ei> (wei ‘water’, ɓeija ‘to lead’), <ou> (tou ‘person’, hagouta ‘many’) and <ua> (wua ‘fruit’, buatu ‘heavy’¹⁰).

In some vowel combinations, the first vowel can be long or short in Kambera (examples in Klamer 1998:23), which could be semantically significant, perhaps especially for <ai> and <au>. In the present dictionary, no difference is made between vowel combinations where the first vowel is long and where it is short. This could result in a phonemic underdifferentiation.

There is vowel breaking of /e/ to <ea> in the stressed syllable before /a/ and /u/ in an open final syllable (ɓera/ɓeara ‘to split’, keɗu/keaɗu ‘to steal’) and of /o/ to <oa> in the stressed syllable before /a/ and /i/ in an open final syllable (toma/toama ‘to arrive’, lori/loari ‘live’). In some words, however, /o/ becomes <oi> before a syllable with /i/ in the open final syllable (koiki ‘monkey’). In some words there is vowel breaking of /i/, which becomes <ia> before an /a/ in a final open syllable (ina/iana ‘mother’, iata ‘we (inclusive)’). In both the present dictionary and in Mitchell (2008), variants with and without vowel breaking are often given, even though vowel breaking is more frequent in the present dictionary. Vowel breaking is also found in Kambera and the Lolina dialect of Weyewa. In Kambera vowel breaking of /e/ and /o/ is found if /a/ is found in the next (final) syllable. Differences with Wanukaka seem to be that the vowel breaking in Kambera is obligatory and not influenced by the presence of a coda. Judging from the dictionaries, there is rarely vowel breaking in Wanukaka if there is a coda (<ng>, <h> in one case), vowel breaking is not found in all words and often there are variants with and without vowel breaking in the same word.
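The main vowel-breaking pattern can be sketched as a function that proposes the broken variant of a form. This is an illustration of the rule as stated above, not a claim about which words actually show breaking: the text stresses that breaking is not found in all words, and the <oi> outcome (as in koiki) is not modelled. The function name `broken_variant` is hypothetical, and the sketch assumes that the penultimate vowel is the stressed one.

```python
import re

# For each breakable vowel: its broken form and the final vowels
# that trigger the breaking, following the description in the text.
BREAKING = {
    "e": ("ea", "au"),
    "o": ("oa", "ai"),
    "i": ("ia", "a"),
}

# A single vowel e, o or i (not preceded by another vowel), followed by
# one or more consonant letters and a word-final vowel, i.e. an open
# final syllable. A form ending in a consonant (a coda) does not match.
PATTERN = re.compile(r"^(.*)(?<![aeiou])([eoi])([^aeiou]+)([aeiou])$")


def broken_variant(form):
    """Return the candidate variant with vowel breaking, or None if the
    form does not meet the conditions described in the text."""
    match = PATTERN.match(form)
    if match is None:
        return None
    prefix, vowel, consonants, final_vowel = match.groups()
    broken, triggers = BREAKING[vowel]
    if final_vowel not in triggers:
        return None
    return prefix + broken + consonants + final_vowel


for form in ["ɓera", "keɗu", "toma", "lori", "ina", "wulang"]:
    print(form, broken_variant(form))
```

The function returns a candidate only; whether the variant is attested has to be checked against the dictionary entries.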

Long vowels are undoubtedly found in all words stressed on the final syllable. However, in the orthography used here, and also in the Bible translations and Mitchell (2008a), double vowels are written in those cases. The advantage is that one does not have to be doubtful about where the stress is in such cases. Very often, a word consisting of one syllable is historically the result of a contraction of a two-syllable word, so it seems quite possible that a phonemic analysis would posit geminated vowels in words stressed on the final syllable.

This dictionary builds on the hypothesis that Klamer’s (1998:16-31) analysis of the word in Kambera is also valid for Wanukaka. The prosodic word can be analysed in much the same way as Klamer (1998:30-31) analyses the prosodic word in Kambera. The ‘root’ consists of two syllables or one syllable with a long vowel, the first of which is stressed and can contain all vowel sounds (long and short /a/, /i/, /e/, /u/ and /o/ and perhaps some diphthongs). There can be up to two pretonic syllables, of which the vowel is always /a/, and the consonants /p/, /t/, /k/, /m/, /h/ and /l/ (examples: pagoru ‘to snore’, tamihik ‘scorpion’, kadoku ‘to dry’, maringu ‘cold’, halaku ‘to walk’, labiku ‘civet cat’, pakanoking ‘to mate (of animals)’). The root can be followed by a coda, consisting of one of the consonants /l/, /r/, /h/, /t/, /k/ and /ŋ/ (examples: ɓakul ‘big’, kawitar ‘mud’, tamelih ‘slippery’, lukut ‘to fold’, taliɗik ‘poisonous snake’, wulang ‘moon’) and the paragogic vowel [u].

Lexical entries (lemmas) contain the following categories: 1. Prosodic words¹¹ minus clitics, 2. Compounds containing two prosodic words, 3. Reduplications, 4. Clitics, 5. Some expressions. The stress on prosodic words is on the penultimate syllable, or on the final syllable if the final vowel is long. Words consisting of more than one long syllable, containing more than one post-stress syllable, or containing pre-stress syllables of an unusual type¹² are considered to be compounds. If at least one of the parts of the compound can be identified with some certainty with a word which occurs by itself, the compound is divided into two constituent parts (oli uma ‘spouse’, lit. ‘friend-house’; ana lalu ‘orphan’, lit. ‘child-?’), otherwise not (arawei ‘wife’). One problem is to distinguish a compound from a noun with a qualifier. This is illustrated by an example given in Mitchell (2008a), i.e. ana=gu moni ‘my son’ and ana moni=gu ‘my brother (woman speaking)’. In the second case, ana and moni are obviously more closely connected than in the first case, so it would seem reasonable to write ana moni ‘son’ and ana-moni ‘brother (woman speaking)’ and apply this distinction in all entries consisting of two words. However, we do not yet have the information to be able to do this in a systematic way. Kambera has several types of reduplication (Klamer 1998:35-42), so it could be expected that this is the case in Wanukaka also. However, the dictionary does not give much information about reduplication. When lemmas which contain reduplication are found, the parts are separated by a hyphen, e.g. nguru-nguru ‘to whisper’. One group of words which is very common in Kambera, and supposedly also in Wanukaka, is ideophones, but they are almost absent in the present dictionary.

Grammatical overview

Because no grammatical investigation of Wanukaka has been made, the statements in this section should be regarded as preliminary, mainly based on the present dictionary and Mitchell (2008a).

Phonemes

Consonant inventory

	labials	dentals	palatals	velars	glottals
unvoiced plosive stops	p	t		k	(ʔ <q>)
voiced plosive stops	b	d	ɟ	g
implosive voiced stops	ɓ	ɗ	ʄ <ĵ>¹³
fricatives					h
nasals	m	n	ɲ <ny>	ŋ <ng>
lateral		l
rhotic		r
approximants	w		j <y>

The phonemic status of the glottal stop [ʔ] is uncertain (see under ‘Forms of the lexical entries and orthography’).

Vowel inventory

According to Mitchell (n.d.), there are “5 long vowel sounds, 5 short vowel sounds and 7 diphthongs” in stressed syllables, but only 3 (/a/, /i/ and /u/) in unstressed syllables. In stressed syllables, long and short forms of /a/, /i/, /e/, /u/ and /o/ are found. Whether there are any phonemic diphthongs has to be left to future analysis.

Affixes

Productive derivational affixes seem to be:

Form	Function
pa-	Valency-increasing: paana ‘to give birth’ (ana ‘child’), pakalung ‘to repair’ (kalung ‘good’)
ha-	Derives quantifying words: hangahu ‘one hundred’, halolu ‘a strand’
ka-ng	kahanging ‘unit’ (hangi ‘one’)

Clitics

The morphemes in the dictionary which correspond to what Klamer (1998:27-29) considers to be clitics in Kambera, can be divided into the following groups: person clitics, articles, conjunctions, prepositions, negation, clause marker. Pronouns in Wielenga (1917), which differ in form from those given in this dictionary, are marked with (W). The set of personal pronouns exemplified by the 1SG ɗugu can perhaps be considered more polite or formal than the set initiated by nyouga.

Personal pronouns and person clitics

	Pers.pron.1	Pers.pron.2	Subject	Possessive	Object ¹⁴	Indirect object
1SG	nyouga, nyouwa, nyanga (W)	ɗugu	ku=	=gu	=wa	=ga
2SG	ou, ouwu	ɗumu	mu=	=mu		=gu
3SG	namu, nâmu (W)	ɗuna	na=	=na	=ya	=nya
1PL.INCL	iata, ĵiata	ɗuda, duta (W)	ta=	=da
1PL.EXCL	nyiama, nyèma (W)	ɗuma	ma=	=ma	=gama
2PL	nyiami, nyiemi, nyiâmô (W)	ɗumi, ɗimi	mi=	=mi
3PL	–, hâmu (W)	ɗuɗa	ɗa=	=ɗa	=ha	=ja

Other person-inflected words are wi- ‘to say’ with possessive clitics and ei- ‘to be, be present, exist’ with object clitics. Subject clitics often combine with other clitics, such as ɓa=/a=, da= ‘negation’ and ka= ‘so that’. In the examples below, which are all from the present dictionary, some uses of the pronominal clitics are shown, mainly for cross-referencing the subject. In example (1) there is a transitive clause where the subject is cross-referenced by the subject clitic mu= ‘you’ and the object ‘the child’ by the object clitic =ya. However, in the intransitive clause (2), the subject, ‘our betelnut chewing’, is cross-referenced by the object clitic =ya. In the clause with the existential verb eingu (3), the subject is cross-referenced by the indirect object clitic =ja. In (4), the first =gu, which is possessive, cross-references the first person subject, and the second =gu, which is an indirect object clitic, the second person object. In (5), the possessive clitic combined with the third person singular direct object clitic designates the continuative aspect.

(1)	mu=	pa-	ĵagal	=ya	na=	lakeaɗa
	2SG.SBJ	CAUS	afraid	3SG.OBJ	ART.DEF.SG	child
	‘You frighten the child.’

(2)	makupang	=ya	=ka	na=	happa	=da
	blocked	3SG.OBJ	PRF	ART.DEF.SG	betelnut_chewing	1PL.INCL
	‘Our betelnut chewing is blocked.’

(3)	ei	=ja	=da
	be_present	3PL.IOBJ	1PL.INCL
	‘They are present (...)’

(4)	na=	pa=	wi	=gu	=gu
	ART.DEF.SG	REL.OBJ	say	1SG.POSS	2SG.IOBJ
	‘Which I said to you’

(5)	ɓali	=gu=ya	nyouga
	return	CONT	1SG
	‘I am returning.’ (i.e. I am asking for permission to leave now.)

Apart from the person clitics listed together with the personal pronouns above, the following words seem to be clitics in Wanukaka. All except the perfective ka are proclitics. Most of them have correspondences in Kambera (see Klamer 1998:27-28), which are given below for comparison.

Wanukaka	Function	Kambera
Articles
na	definite article, singular	na
ɗa	definite article, plural	ɗa
hi	proper article (before names and pronouns)	i
Conjunctions
ɓa / a	‘when; after’	ɓa
hi	‘then’	hi
ka	final ‘so that, in order to’	ka
Prepositions
ta, la-¹⁵	‘at, in, to’	la
Negation
da	negation	nda
Clause type markers
pa	object relative clause marker	pa
ma	subject relative clause marker	ma
Aspectual and mood particles
ka	Perfective	ka

Demonstratives

close	quite close, visible	distant, visible	distant, not visible
nei ‘this’	nei lau	nei habali
nani ‘that’
			nimi habali
[nutu]	nutu lau		nutu habali

Numerals

The units used in forming cardinal numerals are:

Wanukaka	English
hangi, ɗiha, ha-	one
ɗua, ɗabu, ɗuaɗa	two
tilu	three
patu	four
lima	five
namu	six
pitu	seven
walu	eight
hiwa	nine
-bulu	ten
ngahu	hundred
riɓu	thousand

To form multiples of ‘ten’, habulu is used for ‘10’, ɗua m(a)bulu for ‘20’, and the units for 3-9 followed by m(a)bulu for 30-90. For multiples of ‘hundred’, hangahu is ‘100’, ɗua ngahu is ‘200’, and the units for 3-9 followed by ngahu are used for 300-900. For the numbers between the tens, the multiple of ten is followed by a unit number (hangi for ‘one’ and ɗabu for ‘two’). For the numbers between the hundreds, a multiple of hundred is followed by a multiple of ten and a unit number (hangi for ‘one’ and ɗabu for ‘two’).
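The composition rules above can be restated as a short Python sketch. It is only an illustration, not part of the dictionary data: the spellings are taken from the table and the paragraph above, and the form m(a)bulu is kept exactly as written there rather than resolved to one variant. The function names are hypothetical. Two things are assumptions that the description does not state: that the parts are simply juxtaposed, and that a component with value zero is omitted (as in a number like 105). The sketch covers 10-999 only, since the stand-alone units and the thousands are not described in enough detail here.

```python
# Illustrative sketch; spellings follow the numeral table above.
# Assumptions: plain juxtaposition of parts, zero components omitted.

# Unit forms as used inside compound numerals (hangi 'one', ɗabu 'two').
UNITS = {1: "hangi", 2: "ɗabu", 3: "tilu", 4: "patu", 5: "lima",
         6: "namu", 7: "pitu", 8: "walu", 9: "hiwa"}


def tens(t):
    """Return the multiple of ten for t = 1..9 (10, 20, ..., 90)."""
    if t == 1:
        return "habulu"
    head = "ɗua" if t == 2 else UNITS[t]
    return f"{head} m(a)bulu"


def hundreds(h):
    """Return the multiple of hundred for h = 1..9 (100, 200, ..., 900)."""
    if h == 1:
        return "hangahu"
    head = "ɗua" if h == 2 else UNITS[h]
    return f"{head} ngahu"


def numeral(n):
    """Compose a numeral for 10 <= n <= 999 from hundreds, tens and unit."""
    if not 10 <= n <= 999:
        raise ValueError("this sketch covers 10-999 only")
    h, rest = divmod(n, 100)
    t, u = divmod(rest, 10)
    parts = []
    if h:
        parts.append(hundreds(h))
    if t:
        parts.append(tens(t))
    if u:
        parts.append(UNITS[u])
    return " ".join(parts)
```

Under these assumptions, numeral(12) yields habulu ɗabu and numeral(31) yields tilu m(a)bulu hangi.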

Numeral classifiers

Form	Classifier function
bela	broad and flat things
lolu	rope strands
lua	strands, ropes and sheets (?)
jirak	leaves and written things
pangu (sg.), bangu (pl.)	elongated objects such as trees, logs, poles or cigarettes
mawa	objects which come in pairs, such as bracelets

Some words are combinations of numerals and classifiers where the parts are difficult to separate: heingu / heina ‘one (animal)’, ɗeingu ‘two (animals)’, ɗuba ‘two (fruits)’.

Footnotes

¹ In this preface, Wanokaka is used if a geographical area (district, domain or river valley) is meant, and Wanukaka if the language is meant. In fact, the geographical area is called Wanukaka in the Wanukaka language, but because of the pronunciation of most of the languages in west Sumba, the official name of the district is Wanokaka.

² Source: Wurm & Hattori (1981). The same number is given in Fox & Wurm (1983).

³ The source for the figure is given as BPS Sumba Barat. 2008. Sumba Barat dalam angka.

⁴ The number of speakers of Wanukaka given in Lovestrand (2021:27) as 1,200 seems far too low, even if the Wanukaka speakers in Rua are not included, and is perhaps based on a misprint in the source.

⁵ The source for the evidence is given as BPS Sumba Barat. 2008. Sumba Barat dalam angka and personal information from Prof. I Wayan Arka from 2013.

⁶ Putra (2007) concludes that there is only one Sumbanese language, but that conclusion is reached by an incorrect use of the method of dialectometry.

⁷ The same view is expressed in Gunawan (2000:1).

⁸ The existence of the vowel combination <ao> can be doubted, because it seems possible that in the three examples found in the dictionary, there could be a morpheme boundary, and consequently a glottal stop, between <a> and <o>.

⁹ It could be expected that woaya ‘crocodile’ (PMP *buqaya) is an exception, but it seems unlikely.

¹⁰ Buatu could be the result of vowel breaking in view of Kambera mbotu and the Proto-Malayo-Polynesian reconstruction *beReqat, but because the word is buatu in Anakalang, where vowel breaking seems more rare than in Wanukaka, if found at all, vowel breaking is preliminarily not assumed here.

¹¹ For the prosodic word in Kambera, see Klamer (1998:30, 31, 34). The same word structure is assumed for Wanukaka.

¹² Containing a vowel which is not /a/, or containing a consonant other than /p/, /t/, /k/, /m/, /h/ or /l/.

¹³ Often pronounced as a voiced affricate [dʒ] by younger speakers.

¹⁴ There are different types of object clitics, but it has not yet been possible to differentiate them clearly.

¹⁵ Occurs only with the proper article hi (lahi, Kambera lai).

References

Alkitab Wanukaka. n.d. Alkitab Wanukaka – Apps on Google Play. https://play.google.com/store/apps/details?id=id.alkitab.wanukaka
Asplund, Leif. 2010. The languages of Sumba. https://www.academia.edu/6964026/The_Languages_of_Sumba
Budasi, I Gede. 2007. Kekerabatan bahasa-bahasa di Sumba (studi kajian linguistik historis komparatif). PhD thesis, Universitas Gadjah Mada, Yogyakarta.
Budasi, I Gede. 2010. “Bukti-bukti leksikal pembeda Bahasa Wanokaka dan Anakalang di Sumba NTT.” Mabasan 4(1): 24–42.
BPS (Badan Pusat Statistik), Kabupaten Sumba Barat. 2024. Kecamatan Wanokaka dalam angka / Wanokaka District in Figures 2024.
Coward, David F. & Charles E. Grimes. 2000. Making Dictionaries: A guide to lexicography and the Multi-Dictionary Formatter. SIL International.
Djawa, Alex. 2015. Buku pelajaran Hilu Wanukaka... (not seen)
Eberhard, David M., Gary F. Simons & Charles D. Fennig (eds.). 2025. Ethnologue: Languages of the World (28th ed.). https://www.ethnologue.com
ELP (Endangered Languages Project). n.d. Wanukaka. https://endangeredlanguages.com/
Fox, James J. (ed.). 1988. To speak in pairs: Essays on the ritual languages of eastern Indonesia. Cambridge: Cambridge University Press.
Fox, J. J. & S. A. Wurm. 1983. “Lesser Sunda Islands and Timor.” Map 40 in Wurm & Hattori.
Gunawan, Istutiah. 2000. Hierarchy and balance: a study of Wanokaka social organization. Canberra: ANU.
Hammarström, Harald, Robert Forkel, Martin Haspelmath & Sebastian Bank. 2025. Glottolog 5.2. https://glottolog.org
Jagalimu, R. U. P. & Ni Wayan Kasni. 2018. “The meaning of the sign of Pasola...” Retorika 4(1): 1–14. https://ejournal.warmadewa.ac.id/index.php/jret
Klamer, Marian. 1998. A grammar of Kambera. Berlin–New York: Mouton de Gruyter.
Lovestrand, Joseph et al. 2020. Wanukaka wordlist. https://dx.doi.org/10.26278/5f453591885cb
Lovestrand, Joseph. 2019. Lamboya and Wanukaka orthography workshops: Consultant report.
Lovestrand, Joseph. 2020. Sumba wordlist and phrases. https://catalog.paradisec.org.au/collections/JL4
Lovestrand, Joseph. 2021. Languages of Sumba: State of the field. NUSA 70: 39–60.
Mitchell, David. n.d. Sounds and spellings in the Wanukaka language. (formerly online)
Mitchell, David. 1988. Method in the metaphor: the ritual language of Wanukaka. In James J. Fox (ed.), To speak in pairs: essays on the ritual languages of eastern Indonesia, 64–86.
Mitchell, David. 2008a. Hilu Wanukaka: a working dictionary of the Wanukaka language. (formerly online)
Mitchell, David. 2008b. Pola peribahasa Wanukaka / 100 metaphoric couplets used in Wanukaka. (formerly online)
Putra, A. A. Putu. 2007. Segmentasi dialektal bahasa Sumba di pulau Sumba: suatu kajian dialektologi. PhD thesis, Udayana University.
Verdizade, Allahverdi. 2019. Selected topics in the phonology and morphosyntax of Laboya. MA thesis, Stockholm University.
Verheijen, J. A. J. 1977. Bahasa Rembong di Flores barat I. Ruteng: Regio S.V.D.
Wedo, N. 2012. A descriptive study of pronouns of Wanukaka language (a language spoken in West Sumba). Kupang: Artha Wacana Christian University.
Wielenga, Douwe Klaas. 1917. Vergelijkende woordenlijst der verschillende dialecten op het eiland Soemba en eenige soembaneesche spreekwijsen. (Verhandelingen van het Bataviaasch Genootschap van Kunsten en Wetenschappen 61:5). Weltevreden: Albrecht & Co. / ’s Hage: M. Nijhoff.
