{"id":103,"date":"2021-10-28T16:21:53","date_gmt":"2021-10-28T16:21:53","guid":{"rendered":"https:\/\/openbooks.library.unt.edu\/sourcetoanalysis\/?post_type=chapter&#038;p=103"},"modified":"2024-03-04T00:25:19","modified_gmt":"2024-03-04T00:25:19","slug":"chapter-6","status":"publish","type":"chapter","link":"https:\/\/openbooks.library.unt.edu\/sourcetoanalysis\/chapter\/chapter-6\/","title":{"raw":"Annotation: IGT Workflow","rendered":"Annotation: IGT Workflow"},"content":{"raw":"<div class=\"textbox textbox--learning-objectives\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Learning Objectives<\/p>\r\n\r\n<\/header>\r\n<ol>\r\n \t<li>Learn and apply methods for lexicon creation<\/li>\r\n \t<li>Learn and apply methods for hierarchical glossing<\/li>\r\n<\/ol>\r\n<div><\/div>\r\n<\/div>\r\n<h2><strong style=\"text-align: initial;font-size: 14pt\">6.1 Glossing Standards and Conventions<\/strong><\/h2>\r\nTranscription provides a representation of the sounds of an utterance. \u00a0Translation offers access to the information in the utterance in another language. \u00a0 Interlinear glossing provides a representation of the utterance at the word and subword level.\r\n\r\nBy dividing an utterance into morphemes and noting the meaning or function of each morpheme individually, interlinear glossing makes your data more usable for researchers building word and sentence grammars. 
\u00a0It also helps with creating pedagogical materials for language teaching.\r\n\r\nThe most common conventions for interlinear glossing are the Leipzig glossing rules, available here:\r\n\r\n<a href=\"https:\/\/www.eva.mpg.de\/lingua\/pdf\/Glossing-Rules.pdf\">https:\/\/www.eva.mpg.de\/lingua\/pdf\/Glossing-Rules.pdf<\/a>\r\n\r\n&nbsp;\r\n<div class=\"textbox shaded\">\r\n<h4 style=\"text-align: center\">Leipzig Glossing Rules<\/h4>\r\nIn the example below, there are four lines of text: (1) the transcription, (2) morpheme by morpheme gloss, (3) the word gloss, and (4) the free translation. Morphemes in line (1) are aligned to their glosses in line (2) and (3).\r\n\r\n1)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Bawi=ni\u0294\u00a0\u00a0\u00a0 \u00a0\u00a0\u00a0 Chin \u00a0\u00a0 tsoosaa \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 a=eii-t\u0259\u0279\u0279\r\n\r\n2)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Bawi=ERG \u00a0 \u00a0 Chin \u00a0\u00a0 beef \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 3S=eat-CAUS\r\n\r\n3) \u00a0 \u00a0 \u00a0 \u00a0 Bawi \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0Chin \u00a0 \u00a0 beef \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 made.to.eat\r\n\r\n4)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 \u2018Bawi made Chin eat the beef.\u2019\r\n\r\nMorphemes such as <em>tsoosaa<\/em> \u2018beef\u2019 and <em>eii <\/em>\u2018eat\u2019 provide <strong>content<\/strong> information to the clause - these will be nouns and verbs, or they may be adjectives and adverbs. \u00a0Here, we see them \u00a0translated into English in (3). The constituent 'made to eat' is more complicated. \u00a0It has a morpheme that relays content information (i.e. 'eat') and also morphemes that relay grammatical information \u00a0(i.e., causative and third singular, abbreviated as 'caus' and '3s'), respectively. \u00a0Content morphemes are glossed in lower case. 
\u00a0Grammatical morphemes, such as the ergative (erg) marker <em>ni\u0294<\/em> and the causative (caus) marker <em>t\u0259\u0279\u0279<\/em>, are glossed in caps in (2). \u00a0Also, grammatical morphemes, such as <em>t\u0259\u0279\u0279<\/em> (caus), are divided from other morphemes in the same word with a hyphen (-). Clitics, such as the third person singular subject agreement marker (3s), are divided from other morphemes in the same word with an equals sign (=).\r\n\r\n<\/div>\r\n&nbsp;\r\n<div class=\"textbox\">Discussion:\u00a0 What are the main principles and goals of the LGR?\u00a0 To help you with the discussion, read:\u00a0 Chelliah, Shobhana, Mary Burke, and Marty Heaton. (2021). Using interlinear gloss texts to facilitate cross-language comparison and improve language description.\u00a0 <em>Indian Linguistics<\/em> 82(1-2): 1-24.<\/div>\r\n&nbsp;\r\n\r\nForming an analysis and applying it to data is doing the science of linguistics, and the process is <strong>recursive. \u00a0<\/strong>Data informs analysis and analysis refines data. The hardest part of approaching glossing in a new language is starting. When you have no analysis to work from, every word is a puzzle, or several related puzzles. \u00a0What we aim for is glossing that is:\r\n<ul>\r\n \t<li><strong>Accessible:<\/strong> It should allow even a novice glosser to move through the text, even if some morphemes cannot be completely glossed because more analysis is required.<\/li>\r\n \t<li><strong>Not relayed only by English translations: <\/strong>Translations, especially of grammatical morphemes, are likely to be imprecise and may result in unclear glosses or mistranslations in other sentences.<\/li>\r\n \t<li><strong>Complete: <\/strong>It should reflect the hypotheses developed and knowledge gained by the glosser during the course of the work. It should include both syntactic and semantic information relevant to future studies. 
It should reflect known information as well as unknown information transparently.<\/li>\r\n \t<li><strong>Searchable:<\/strong> Terms should be appropriate for use with more than one language and be represented consistently so that all instances can be easily found.<\/li>\r\n \t<li><strong>Adaptable:<\/strong> Both commonly used category labels and language-specific\/researcher-specific glosses should be possible.<\/li>\r\n<\/ul>\r\nThe set of decisions that need to be made to reach such glossing conventions can be overwhelming, leaving researchers feeling like they need more training before they can even begin. \u00a0Luckily, however, we know more than we think we know and, as we discuss in the rest of this chapter, the novice glosser can use this knowledge for glossing:\r\n<ol>\r\n \t<li>We know or we can learn the expected grammatical categories for a language based on related languages - \u00a0Typology can inform glossing!<\/li>\r\n \t<li>We can often guess from context the meaning of a grammatical morpheme even if we don't have a name for it - Placeholder labels are our friends!<\/li>\r\n \t<li>We can differentiate between different senses of a morpheme - \u00a0Glosses can reflect the historical development of a morpheme!<\/li>\r\n<\/ol>\r\nStandards, such as the Leipzig Glossing Rules (LGR), are beneficial for data consistency. \u00a0Such consistency can help with comparison of forms within a corpus and with cross-language comparison. \u00a0 However:\r\n<ol>\r\n \t<li>LGR will not provide the vocabulary needed for semantic glossing of all morphemes in all languages. As you learned from the reading above, this was not the goal of LGR. So, even with LGR printed out and at your side when glossing, you will have to come up with glosses for morphemes.<\/li>\r\n \t<li><span style=\"text-align: initial;font-size: 14pt\">While a language expert, your consultant, can tell you the meanings of many content morphemes, they will likely not be able to gloss grammatical morphemes. 
\u00a0<\/span><\/li>\r\n<\/ol>\r\n<h2><strong>6.2 Clause and Phrase Segmentation<\/strong><\/h2>\r\nAfter speech has been transcribed and translated, the next step is to segment speech into clauses, phrases, words, and morphemes. Create a guide for yourself on what you expect to find at the edge of a clause.\u00a0For Tibeto-Burman languages, this will often be a clause chaining mechanism such as a nominalizer or other clausal subordinator. \u00a0Making a list of the members of the closed category of subordinators will help you identify clause boundaries.\r\n\r\nWhen using the automatic segmentation tool of SayMore or ELAN, the pauses at the edges of intonational chunks will automatically break connected speech into chunks that roughly correspond to phrasal or clausal constituents.\r\n\r\nNative speakers will usually be more comfortable matching larger phrases between the transcription and translation than individual words and morphemes. \u00a0Here the Hakha Lai sentence discussed in 6.3 is segmented roughly into phrases:\r\n<div class=\"textbox\">\r\n\r\n[tii vaa poo\u014b pa khat a\u0294]\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 [ruul lee h\u014ber\u0294 tee] \u00a0\u00a0\u00a0\u00a0 [an rak um] \u00a0\u00a0\u00a0 [an tii]\r\n\r\n[Near a river]\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 [snake and ant]\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 [there was]\u00a0\u00a0\u00a0\u00a0\u00a0 [they say]\r\n\r\n\u201cThere were a snake and an ant near a river, they say.\u201d\r\n\r\n<\/div>\r\n<h2><strong>6.3 Free Morphemes, Bound Morphemes, and Clitics<\/strong><\/h2>\r\nNative speaker intuitions are key to the segmentation of utterances into morphemes. Be aware, however, that an additional challenge comes in how morphemes are represented in the practical orthography. 
Consider the following example from a Hakha Lai story called \u201cThe Snake and the Ant\u201d, from R. Roengpitya (LTBA 20.2, p. 44).\r\n<div class=\"textbox\">\r\n\r\ntii vaa poo\u014b pa khat a\u0294 ruul lee h\u014ber\u0294 tee an rak um an tii\r\n\r\n\u201cThere were a snake and an ant near a river, they say.\u201d\r\n\r\n<\/div>\r\nHere both lexical and grammatical morphemes are written separately. The orthography treats both the same way even though in many cases the grammatical morphemes are bound (must occur with another form) and the lexical morphemes are free (can occur without other morphology). \u00a0 On the morpheme representation and morpheme glossing lines, we can make clear whether we are dealing with bound or free morphology. \u00a0For example, a morpheme with a hyphen before or after it might signal a bound affix.\r\n<h2><strong>6.4 Morpheme Glossing<\/strong><\/h2>\r\n<div class=\"textbox\">\r\n\r\nDiscussion: Once a clause is divided into phrases and you have the free translation, what next steps will you take to further divide the phrase into content and function morphemes? 
How will you:\r\n<ul>\r\n \t<li>Discover where the head of phrase is?<\/li>\r\n \t<li>Develop a hypothesis about the order of morphemes?<\/li>\r\n \t<li><span style=\"text-align: initial;font-size: 0.9em\">Identify the edge of\u00a0 a clause or phrase?<\/span><\/li>\r\n \t<li>Determine if a morpheme is free, bound, or a clitic?<\/li>\r\n \t<li>Gloss a \u00a0grammatical morpheme?<\/li>\r\n<\/ul>\r\n<\/div>\r\n<h3>Identifying grammatical morphemes<\/h3>\r\nWhether you are a speaker of the language doing the translation yourself, or you are working with a speaker to arrive at the translation you might pay attention to these types of comments:\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Morphemes that correspond to functional categories in English (pronouns, adpositions, auxiliary verbs, etc.)<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\"That word corresponds to the <em>to<\/em> part of the phrase\"<\/div>\r\n<\/div>\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Morphemes that require description<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n\u201cThis word means that the action was <em>performed earlier today<\/em>\u201d\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Morphemes that don\u2019t change the meaning of the lexical word.<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n\u201cThat word attached to <em>eat<\/em> means <em>the process of<\/em> <em>eating<\/em>\u201d\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Morphemes that don\u2019t mean the same thing when split from the lexical word.<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n\u201cThis word means that the tiger is a woman, but 
it doesn\u2019t mean <em>woman<\/em> by itself.\u201d\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Morphemes with no clear meaning<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n\u201cI don\u2019t know what that word means.\u201d\r\n\r\n<\/div>\r\n<\/div>\r\nThese and other native speaker intuitions are helpful clues as to whether a morpheme is lexical or grammatical and bound or free.\r\n<h3>Naming grammatical morphemes<\/h3>\r\nRecall that in the last chapter we were offered the free translation \u201cJean could go\u201d for the French sentence below. \u00a0We asked a few questions to a native speaker, allowing us to associate English translations with each word in the French sentence.\r\n<div class=\"textbox\">\r\n\r\nJean \u00a0\u00a0\u00a0 pourrais \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 aller.\r\n\r\nJean\u00a0\u00a0\u00a0\u00a0 could\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 go\r\n\r\n\u2018Jean could go.\u2019\r\n\r\n<\/div>\r\nWhile the name <em>Jean<\/em> and the verb <em>aller<\/em> \u2018go\u2019 offer more straightforward translations, <em>pourrais<\/em> is more problematic, since it is glossed with the English modal verb \u2018could\u2019 which has several possible interpretations that may or may not line up with the French word including:\r\n<ul>\r\n \t<li>Ability: Jean <em>could<\/em> go to the store (before the accident).<\/li>\r\n \t<li>Possibility: Jean <em>could<\/em> go to the store (on his way home, if he thinks of it).<\/li>\r\n<\/ul>\r\nFurthermore, a native speaker might tell us that <em>pourrais<\/em>:\r\n<ul>\r\n \t<li>Has other forms including \u2018pouvoir\u2019 and \u2018peux\u2019 (indicating that the word has several possible forms)<\/li>\r\n \t<li>May also be translated as <em>might<\/em>, <em>can<\/em>, or <em>may <\/em>(indicating that its translation into English is 
contextual).<\/li>\r\n<\/ul>\r\nWe have now determined that glossing for <em>pourrais<\/em> is not a simple matter! \u00a0Here are the possible strategies we can follow:\r\n<div class=\"textbox shaded\">\r\n<h4>So how do you gloss it?<\/h4>\r\n<strong>Mark it with &lt;?&gt; and deal with it later.<\/strong> This is the easiest option, but it ignores any information collected about the morpheme in the process of glossing the text. Since it ignores collected information and moves on, it allows the glosser to cover more ground faster.\r\n\r\n<strong>Translate it with a similar word in the glossing language<\/strong> (like \u2018might\u2019 or \u2018could\u2019). The advantage of this option over (1) is that it keeps track of some information discovered in the process of glossing. This option is a workable option for lexical morphemes but ultimately confusing for grammatical ones as:\r\n<ul>\r\n \t<li>It is imprecise (<em>might<\/em> can express possibility or permission)<\/li>\r\n \t<li>It assumes more analysis exists than has been done<\/li>\r\n \t<li>It assumes the translatability of grammatical morphemes between languages that may not have accurate correspondences.<\/li>\r\n \t<li>It also may lose information about the syntactic position of the morpheme since words with modal meanings may be expressed in a number of ways in a language not limited to a certain syntactic position (as adverbials or verbs with complement clauses).<\/li>\r\n \t<li>It makes the text less searchable, since someone looking to do analysis on modals would have to use a set of English modals (could, would, should, must, can, etc.) to find the data to begin analysis. 
In other words, it is not tagged in a way that facilitates research.<\/li>\r\n<\/ul>\r\n<strong>Give it a name like \u2018possibility\u2019 noting whatever your preliminary analysis is, based on context and translation.<\/strong> This option has some of the advantages of (2) in that it keeps information gained from the process of glossing. It is more precise (\u201cthis morpheme expresses possibility\u201d) than giving it a loose translation, but it suffers from the same searchability problem. If morphemes are marked with whatever the researcher feels is the closest analysis, how do you search to find all items with a like distribution to further analyze the data?\r\n\r\n<strong>Mark it as a \u2018modal\u2019 or \u2018verbal auxiliary\u2019 element<\/strong>. This option offers a broad category label and is more searchable, but loses more nuanced information about semantics gleaned from the process of glossing.\r\n\r\n<\/div>\r\n<h3><\/h3>\r\n<h3 style=\"text-align: center\">General grammatical category label<\/h3>\r\nThe method we offer here to meet the challenge of how to gloss a grammatical morpheme involves using a two-part name:\r\n<ul>\r\n \t<li>Part 1: \u00a0the more clearly understood information, like the general category to which the morpheme belongs<\/li>\r\n \t<li>Part 2: \u00a0semantic information, which is the meaning relayed by the morpheme as indicated by the free translation<\/li>\r\n<\/ul>\r\nFor example, aspect is a general grammatical category and progressive is a specific type of aspect. \u00a0So, the two-part name would be aspect:progressive. \u00a0It may be that you know the left side of the colon or the right side or both. \u00a0You can more rapidly and confidently gloss by stating what you do know and leaving what you don't know for later. \u00a0The system proposed here allows researchers to flexibly represent their understandings of word and morpheme glosses, revising them with increasingly deep levels of analysis. 
\u00a0The first step in making this method of hierarchical interlinear glossing (HIGT) work is to create a list of recurring grammatical categories. \u00a0And this is where our typological understanding of Tibeto-Burman can help!\r\n<div class=\"textbox textbox--sidebar textbox--examples\"><\/div>\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Examples of grammatical categories<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\nAGR:\u00a0 inflection agreement morphology for agent or patient and may vary according to person and number\r\n\r\nDIR:\u00a0 derivational morphology primarily indicating direction of action of motion verb\r\n\r\nSUB:\u00a0 subordinators affixed to finite or nominalized clauses to create subordinate clauses in clause chaining or as complements to verbs<strong>.<\/strong>\r\n\r\n<\/div>\r\n<\/div>\r\n<h3><\/h3>\r\n<h3>Specific semantic category label<\/h3>\r\nThis is where the specific instantiation of the grammatical category (e.g., subordinator, directional or agreement marker) is identified with a semantic label.\u00a0 For example:\r\n\r\n(1) subordinator: simultaneous\r\n\r\n(2) subordinator: sequential\r\n\r\n(3) subordinator: after acting\r\n\r\nThe general grammatical category is subordinator. \u00a0The specific instantiations are simultaneous or sequential.\r\n\r\nCompare (2) and (3). \u00a0Imagine that annotator A writes (2) and annotator B writes (3). \u00a0Annotator B might simply not know the term sequential, but both annotators are referring to the same morpheme. 
\u00a0We can recapture at least some of that information by looking at the left side of the colon and the morpheme itself.\r\n\r\nIt is difficult, to put it mildly, to define the semantics of a morpheme and incorporate all aspects of information about that morpheme in a gloss.\u00a0 Simply standardizing across all annotators may actually hide the nuances of meaning we gain from what is on the right side of the colon.\u00a0 So, both left and right side glossing as a pair will provide the best tracking.\u00a0 It will also allow the annotator to write down with certainty the left side while figuring out the right side.\r\n\r\nHere are some clues that can go into your decision on the semantic label for a morpheme.\u00a0 These sources of hypotheses are valuable and merit preserving, especially in forming early analyses.\r\n<div class=\"textbox shaded\">\r\n<p style=\"text-align: center\"><strong>Information from existing sources on the language<\/strong><\/p>\r\n<p style=\"text-align: center\"><em>A previous researcher calls this a past tense marker.<\/em><\/p>\r\n&nbsp;\r\n<p style=\"text-align: center\"><strong>Contextual information from the text about position and meaning<\/strong><\/p>\r\n<p style=\"text-align: center\"><em>Past tense information in the translation doesn\u2019t appear elsewhere in the sentence, so it may be tied to this morpheme<\/em>.<\/p>\r\n&nbsp;\r\n<p style=\"text-align: center\"><strong>Translations, meta-information, and other intuitions offered by native speakers<\/strong><\/p>\r\n<p style=\"text-align: center\"><em>The language assistant said, \u2018I think that part means that it happened already.\u2019<\/em><\/p>\r\n&nbsp;\r\n<p style=\"text-align: center\"><strong>Typological and theoretical research on related phenomena<\/strong><\/p>\r\n<p style=\"text-align: center\"><em>Usually, morphemes with this kind of behavior are called past-tense markers in other languages.<\/em><\/p>\r\n&nbsp;\r\n<p style=\"text-align: 
center\"><strong>Analyses of phenomena in related languages<\/strong><\/p>\r\n<p style=\"text-align: center\"><em>This morpheme looks like a past tense marker in a related language.<\/em><\/p>\r\n<p style=\"text-align: center\"><strong>Your own intuition<\/strong><\/p>\r\n<p style=\"text-align: center\"><em>Based on my past experiences, this seems to be marking past tense<\/em>.<\/p>\r\n\r\n<\/div>\r\nYou can read more about the process of semantic glossing here: Bochnak, M. Ryan and Lisa Matthewson 2020.\u00a0<a href=\"https:\/\/linguistics.sites.olt.ubc.ca\/files\/2019\/07\/BochnakMatthewsonAnnualReviewSemanticFieldwork.pdf\">Techniques in complex semantic fieldwork.<\/a>\u00a0Annual Review of Linguistics 6:261-283.\r\n\r\n&nbsp;\r\n<h3 style=\"text-align: center\">Updating analyses<\/h3>\r\nGlosses are analyses, and analyses require updates. Consider the following Zophei example:\r\n<div class=\"textbox\">\r\n\r\na-pa-va-ming\r\n\r\n\u2018He checked up on me.\r\n\r\n<\/div>\r\nIn this example, we have identified:\r\n<ul>\r\n \t<li>That the first two morphemes in this complex word are agreement markers associated with a 3rd person singular subject (AGR:3SBJ) and a 1st person object (AGR:1OBJ).<\/li>\r\n \t<li>That <em>vaming<\/em> means 'to check up on', but <em>ming<\/em> means 'to watch'.<\/li>\r\n \t<li>The native speaker doesn't have a clear intuition about what <em>va<\/em> means by itself saying it may be related to the sense that you go somewhere to check up on someone, or to the event happening in the past.<\/li>\r\n<\/ul>\r\nWe decide that <em>va<\/em> is a morpheme that we should gloss, but we are unsure what to call it without more data.\r\n<div class=\"textbox\">\r\n\r\na-pa-<strong>va<\/strong>-ming\r\n\r\nAGR:3SBJ-AGR:1OBJ-_______-watch\r\n\r\n\u2018He checked up on me.\r\n\r\n<\/div>\r\n<h3 style=\"text-align: center\">First pass analysis- placeholder for position<\/h3>\r\nOne useful strategy is to label this morpheme according to its position, 
e.g., PRVP for pre-verbal particle. This way, it can be easily found when gathering up other morphemes in this position, other 'pre-verbal particles', when we are ready for the next iteration of morphological analysis. Since the speaker noted that it could have to do with movement or with tense, our hypotheses are that it is a directional (DIR) or past tense (PST) marker, and we can make a note of this. FLEx provides multiple note fields for such reminders and the note fields can be searched.\r\n<div class=\"textbox\">\r\n\r\na-pa-<strong>va<\/strong>-ming\r\n\r\nAGR:3SBJ-AGR:1OBJ-<strong>PRVP<\/strong>-watch\r\n\r\n\u2018He checked up on me.\u2019\r\n\r\n<\/div>\r\nIn our next pass of analysis, to make progress on this morpheme, we can use search features to:\r\n<ul>\r\n \t<li>find all items transcribed as <em>va<\/em><\/li>\r\n \t<li>find all items marked as PRVP to compare <em>va<\/em> with other markers in the same or a similar pre-verbal position<\/li>\r\n \t<li>find all items marked with DIR to compare with other suspected directional markers<\/li>\r\n \t<li>find all items marked with PST to compare with other suspected past tense markers<\/li>\r\n<\/ul>\r\n<h3 style=\"text-align: center\">Second pass analysis<\/h3>\r\nAfter comparing <em>va<\/em> with other pre-verbal markers in the language, we have decided that we are confident in calling it a directional marker. From here, we revise our category to DIR, but we aren't sure what subcategory to use. 
Rather than leave it blank, we mark it with <em>uk<\/em> to indicate that it is unknown information.\r\n\r\nDIR:uk\r\n\r\nBetter yet, we can mark other information, such as that the morpheme indicates that the associated motion is going, that the movement is across level ground, or other potentially relevant information:\r\n\r\nDIR:going\r\n\r\nDIR:level\r\n\r\nWith increasing nuance, we can create more accurate glosses.\r\n<div class=\"textbox\">Activity: Thinking about a language you are currently working on, make a list of at least 15 bound morphemes. What are the meanings of these morphemes? How would you gloss these using hierarchical glossing? What general grammatical category do they belong to? What specific semantics do they have?<\/div>\r\n&nbsp;\r\n<h2>6.5 Glossing Hakha Lai<\/h2>\r\nWith the proposed workflow in mind, let's work back through the Hakha Lai sentence previously divided into phrases in 6.2.\r\n\r\n&nbsp;\r\n<div class=\"textbox\">\r\n\r\n[tii vaa poo\u014b pa khat a\u0294]\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 [ruul lee h\u014ber\u0294 tee] \u00a0\u00a0\u00a0\u00a0 [an rak um] \u00a0\u00a0\u00a0 [an tii]\r\n\r\n[Near a river]\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 [snake and ant]\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 [there was]\u00a0\u00a0\u00a0\u00a0\u00a0 [they say]\r\n\r\n\u201cThere were a snake and an ant near a river, they say.\u201d\r\n\r\n<\/div>\r\nLet's consider each phrase individually, assuming we have a native speaker to ask questions of.\r\n<h3>\"Near a River\"<\/h3>\r\nA native speaker offers the following translations of individual words.\r\n<div class=\"textbox\">\r\n\r\ntiivaa \u00a0 <strong>poo\u014b \u00a0\u00a0 pakhat \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 a\u0294\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <\/strong>\r\n\r\nriver<strong>\u00a0\u00a0\u00a0 
<em>near<\/em>\u00a0\u00a0\u00a0 <em>one<\/em>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <em>at<\/em><\/strong>\r\n\r\n\u201cNear a river\u201d\r\n\r\n<\/div>\r\n<h3><em>tiivaa<\/em><\/h3>\r\nThe native speaker indicates that <em>tiivaa<\/em> means \u2018river\u2019, so we have combined the syllables <em>tii<\/em> and <em>vaa<\/em>. The word <em>tii<\/em> means \u2018water\u2019, but <em>vaa<\/em> has no clear and apparent meaning, so we decide that it may be related historically but is not clearly decomposable in the modern language. We combine the syllables into the same word and gloss it together as 'river'.\r\n<h3><em>poo\u014b<\/em><\/h3>\r\nThis morpheme appears after the noun with the meaning \u2018near\u2019, but the phrase still needs the postposition <em>a\u0294<\/em>, currently translated as \u2018at\u2019. So, it is likely that <em>poo\u014b<\/em> is a noun with the translation 'near', meaning something like \u2018the area near\u2019. So, we\u2019ll just translate it as \u2018near\u2019.\r\n<h3><em>pakhat<\/em><\/h3>\r\nOur free translation for this word is \u2018a\u2019, a function word in English that expresses that the noun is indefinite. Without any other information about this language, we can assume that a translation of \u2018a\u2019 is likely not appropriate based on its syntactic category and semantic meaning. To start, we do not yet know whether the language encodes definiteness at all, let alone how it is marked. Next, if we ask for other numbers, we get <em>pahnih<\/em> \u2018two\u2019, <em>pathum<\/em> \u2018three\u2019, and <em>pali<\/em> \u2018four\u2019. Since each word starts with <em>pa<\/em>-, we can hypothesize that <em>pa<\/em>- is a separate morpheme, likely a numeral classifier. We can also guess that if there is one numeral classifier, there will be others. So, we should leave room to update our annotation as our understanding of the phenomenon builds. 
For this one, we\u2019ll separate the word into two morphemes and gloss them:\r\n<div class=\"textbox\">\r\n\r\n<em>pa-khat<\/em>\r\n\r\nCLF:uk-one\r\n\r\n<\/div>\r\nThis option captures the information that:\r\n\r\n(a) <em>pa-<\/em> is a numeral classifier (CLF)\r\n\r\n(b) We don\u2019t know what contexts this CLF is used in, and what other CLFs we\u2019ll come across, but we want to leave a placeholder to later add that information (CLF:<strong>uk<\/strong>)\r\n\r\n(c) The numeral element <em>khat <\/em>means \u2018one\u2019\r\n<h3><em>a\u0294<\/em><\/h3>\r\nOur translation for this word is another function word \u2018at\u2019. In English, \u2018at\u2019 has really variable and idiosyncratic use (e.g. \u2018at school\u2019, \u2018at 1pm\u2019, \u2018at last\u2019), so we know a translation will not be an accurate way to represent this HL word. What we do know is that it marks a noun phrase as an adjunct (not a required part of the sentence), so we can call it an <em>oblique case marker<\/em> (OBLCM). Since, at least in this context, it is a reference to a location, we will tentatively label it \u2018locative\u2019 (LOC). 
So, our gloss is OBLCM:LOC\r\n\r\nHere is the new gloss:\r\n<div class=\"textbox\">\r\n\r\ntiivaa \u00a0 poo\u014b \u00a0\u00a0 pa-khat \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 a\u0294\r\n\r\nriver\u00a0\u00a0\u00a0 near\u00a0\u00a0\u00a0\u00a0 CLF:uk-one\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 OBLCM:LOC\r\n\r\n\u201cNear a river\u201d\r\n\r\n<\/div>\r\n&nbsp;\r\n<h3>\"A snake and an ant\"<\/h3>\r\nA native speaker offers the following translations of individual words.\r\n<div class=\"textbox\">\r\n\r\nruul \u00a0 \u00a0 lee \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 h\u014ber\u0294tee.\r\n\r\nsnake\u00a0\u00a0 and\u00a0\u00a0\u00a0\u00a0\u00a0 \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 ant\r\n\r\n\u201ca snake and an ant\u201d\r\n\r\n<\/div>\r\n<h3><em>h\u014ber\u0294tee<\/em><\/h3>\r\nThese two syllables together are translated as \u2018ant\u2019 so we\u2019ve combined syllables in our transcription accordingly.\r\n<h3><em>lee<\/em><\/h3>\r\nThe translation we got for this word is \u2018and\u2019, which is a conjunction. Conjunctions may have more or less limited contexts where they can show up, and a language may have many conjunctions. So, rather than gloss this (and likely other conjunctions) as \u2018and\u2019, we can gloss this as a conjunction (CONJ). 
To note that there may be more relevant information to add to this gloss, we will add 'uk' as a subcategory and leave it to future study (CONJ:uk).\r\n\r\nHere is the new gloss:\r\n<div class=\"textbox\">\r\n\r\nruul \u00a0 \u00a0 \u00a0lee \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 h\u014ber\u0294tee.\r\n\r\nsnake\u00a0\u00a0 CONJ:uk \u00a0 \u00a0 \u00a0 \u00a0 ant\r\n\r\n\u201ca snake and an ant\u201d\r\n\r\n<\/div>\r\n<h3><\/h3>\r\n<h3>\"There was\"<\/h3>\r\nA native speaker offers the following translations of individual words.\r\n<div class=\"textbox\">\r\n\r\nan \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 rak \u00a0\u00a0\u00a0\u00a0\u00a0 um\r\n\r\n?\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 ? \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <em>was<\/em>\r\n\r\n\u201cthere was\u201d\r\n\r\n<\/div>\r\n&nbsp;\r\n<p style=\"text-align: center\"><em>an &amp; rak<\/em><\/p>\r\nCurrently, we do not have many clues from translation to figure out the function of these two morphemes. We can call these both pre-verbal particles (PRVP:uk) and move on.\r\n<h3><em>um<\/em><\/h3>\r\nThis verb is currently translated as \u2018was\u2019. Words translated with a form of \u2018be\u2019 are often some type of copula (COP) and we are also unsure about the representation of past tense. 
We also know that here it is reporting the <em>existence<\/em> of a snake and an ant at a place, so we can add that information in the subcategory gloss (COP:exist)\r\n\r\nHere is the new gloss:\r\n<div class=\"textbox\">\r\n\r\nan \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 rak \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 um\r\n\r\nPRVP:uk \u00a0 \u00a0 \u00a0 \u00a0 PRVP:uk \u00a0 \u00a0 \u00a0 \u00a0 \u00a0COP:exist\r\n\r\n\u201cthere was\u201d\r\n\r\n<\/div>\r\n<h3>\"They say\"<\/h3>\r\nA native speaker offers the following translations of individual words.\r\n<div class=\"textbox\">\r\n\r\n<strong>an<\/strong> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 tii\r\n\r\n<strong>?<\/strong> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 say\r\n\r\n\u201cthey say\u201d\r\n\r\n<\/div>\r\n<h3><em>an<\/em><\/h3>\r\nWe saw this morpheme in the previous line, but here we get the clue that it should correspond with the subject \u2018they\u2019. This lets us know that it is likely a pronoun, pronominal clitic, or agreement marker. For now, without other information about the language, we can add subject information to the subcategory gloss (PRVP:3plsubj), keeping this morpheme in the \u2018pre-verbal particle\u2019 category so it is easily compared against other PRVPs. 
(If we had enough information to call this, for example, an agreement marker, we could use AGR:3plsubj.)\r\n\r\nHere is the new gloss:\r\n<div class=\"textbox\">\r\n\r\n<strong>an<\/strong> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 tii\r\n\r\nPRVP:3plsubj say\r\n\r\n\u201cthey say\u201d\r\n\r\n<\/div>\r\n&nbsp;\r\n<h3>Working towards linguistic analysis<\/h3>\r\nHere is our updated gloss:\r\n<div class=\"textbox\">\r\n\r\ntiivaa \u00a0 poo\u014b \u00a0\u00a0 pa-khat \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0a\u0294\r\n\r\nriver\u00a0\u00a0\u00a0 near\u00a0\u00a0\u00a0\u00a0 CLF:uk-one\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 OBLCM:LOC\r\n\r\n'near a river'\r\n\r\n&nbsp;\r\n\r\nruul \u00a0 \u00a0 \u00a0lee \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0h\u014ber\u0294tee.\r\n\r\nsnake\u00a0\u00a0 CONJ:uk \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 ant\r\n\r\n'a snake and an ant'\r\n\r\n&nbsp;\r\n\r\nan \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 rak \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0um\r\n\r\nPRVP:uk \u00a0 \u00a0 \u00a0 \u00a0 PRVP:uk \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 COP:exist\r\n\r\n'there was'\r\n\r\n&nbsp;\r\n\r\nan \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 tii\r\n\r\nPRVP:3plsubj \u00a0 \u00a0say\r\n\r\n'they say'\r\n\r\n<\/div>\r\n&nbsp;\r\n\r\nSome of these glosses are good enough to work for us until we have more examples of the same word or other words in the same category for analysis. OBLCM:LOC and COP:exist, for example, have both a grammatical category and a semantic category gloss, getting us close to the gloss we may ultimately settle on for our corpus. 
CONJ:uk and CLF:uk are also likely good enough until we have a lot of examples of conjunctions and numeral classifiers for a larger and more detailed look at these categories.\r\n\r\nThe gloss PRVP, however, only works as a temporary landing spot for morphemes showing up before the verb, within the same roughly segmented phrase. We can leave these and move on, but after a bit more consideration, we may be able to move these morphemes to more apt categories.\r\n<h3><em>an<\/em><\/h3>\r\nThe first problem we can see is that <em>an<\/em> is glossed as both PRVP:uk and PRVP:3plsubj. In the first instance, the translation has an expletive (or \u201cdummy\u201d) subject \u2018there\u2019 and in the second instance, it is translated as \u2018they\u2019 (referring to unknown or unspecified people). More examples will be needed to determine the grammatical category.\r\n\r\nVisit the CoRSAL website to find links to IGT examples. We invite you to try out this method with IGT at your disposal. What are the shortcomings of this method? What are the advantages?\r\n\r\nIn the next section, we will review the use of the program FLEx, where annotations can be stored and revised to record the annotator's growing understanding.","rendered":"<div class=\"textbox textbox--learning-objectives\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Learning Objectives<\/p>\n<\/header>\n<ol>\n<li>Learn and apply methods for lexicon creation<\/li>\n<li>Learn and apply methods for hierarchical glossing<\/li>\n<\/ol>\n<div><\/div>\n<\/div>\n<h2><strong style=\"text-align: initial;font-size: 14pt\">6.1 Glossing Standards and Conventions<\/strong><\/h2>\n<p>Transcription provides a representation of the sounds of an utterance. \u00a0Translation offers access to the information in the utterance in another language. 
\u00a0 Interlinear glossing provides a representation of the utterance at a word and subword level.<\/p>\n<p>By dividing an utterance into morphemes and noting the meaning or function of each morpheme individually, interlinear glossing makes your data more usable for researchers in building word and sentence grammar. \u00a0It also helps with creating pedagogical materials for language teaching.<\/p>\n<p>The most common conventions for interlinear glossing are the Leipzig glossing rules, available here:<\/p>\n<p><a href=\"https:\/\/www.eva.mpg.de\/lingua\/pdf\/Glossing-Rules.pdf\">https:\/\/www.eva.mpg.de\/lingua\/pdf\/Glossing-Rules.pdf<\/a><\/p>\n<p>&nbsp;<\/p>\n<div class=\"textbox shaded\">\n<h4 style=\"text-align: center\">Leipzig Glossing Rules<\/h4>\n<p>In the example below, there are four lines of text: (1) the transcription, (2) the morpheme-by-morpheme gloss, (3) the word gloss, and (4) the free translation. Morphemes in line (1) are aligned to their glosses in lines (2) and (3).<\/p>\n<p>1)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Bawi=ni\u0294\u00a0\u00a0\u00a0 \u00a0\u00a0\u00a0 Chin \u00a0\u00a0 tsoosaa \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 a=eii-t\u0259\u0279\u0279<\/p>\n<p>2)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Bawi=ERG \u00a0 \u00a0 Chin \u00a0\u00a0 beef \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 3S=eat-CAUS<\/p>\n<p>3) \u00a0 \u00a0 \u00a0 \u00a0 Bawi \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0Chin \u00a0 \u00a0 beef \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 made.to.eat<\/p>\n<p>4)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 \u2018Bawi made Chin eat the beef.\u2019<\/p>\n<p>Morphemes such as <em>tsoosaa<\/em> \u2018beef\u2019 and <em>eii <\/em>\u2018eat\u2019 provide <strong>content<\/strong> information to the clause - these will be nouns and verbs, or they may be adjectives and adverbs. \u00a0Here, we see them \u00a0translated into English in (3). 
The constituent 'made to eat' is more complicated. \u00a0It has a morpheme that relays content information (i.e., 'eat') and also morphemes that relay grammatical information \u00a0(i.e., causative and third person singular, abbreviated as 'caus' and '3s', respectively). \u00a0Content morphemes are glossed in lower case. \u00a0Grammatical morphemes, such as the ergative (erg) marker <em>ni\u0294<\/em> and the causative (caus) marker <em>t\u0259\u0279\u0279<\/em>, are glossed in caps in (2). \u00a0Also, grammatical morphemes, such as <em>t\u0259\u0279\u0279<\/em> (caus), are divided from other morphemes in the same word with a hyphen (-). Clitics, such as the third person singular subject agreement marker (3s), are divided from other morphemes in the same word with an equals sign (=).<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<div class=\"textbox\">Discussion:\u00a0 What are the main principles and goals of the LGR?\u00a0 To help you with the discussion, read:\u00a0 Chelliah, Shobhana, Mary Burke, and Marty Heaton. (2021). Using interlinear gloss texts to facilitate cross-language comparison and improve language description.\u00a0 <em>Indian Linguistics<\/em> 82(1-2): 1-24.<\/div>\n<p>&nbsp;<\/p>\n<p>Forming an analysis and applying it to data is doing the science of linguistics, and the process is <strong>recursive. \u00a0<\/strong>Data informs analysis and analysis refines data. The hardest part of approaching glossing in a new language is starting. When you have no analysis to work from, every word is a puzzle, or several related puzzles. 
\u00a0What we aim for is glossing that is:<\/p>\n<ul>\n<li><strong>Accessible:<\/strong> It should allow even a novice glosser to move through the text, even if some morphemes cannot be completely glossed until more analysis is done.<\/li>\n<li><strong>Not relayed only by English translations: <\/strong>Translations, especially of grammatical morphemes, are likely to be imprecise and may result in unclear glosses or mistranslations in other sentences.<\/li>\n<li><strong>Complete: <\/strong>It should reflect the hypotheses developed and knowledge gained by the glosser during the course of the work. It should include both syntactic and semantic information relevant to future studies. It should reflect known information as well as unknown information transparently.<\/li>\n<li><strong>Searchable:<\/strong> Terms should be appropriate for use with more than one language and represented consistently so that all instances can be easily found.<\/li>\n<li><strong>Adaptable:<\/strong> Both commonly used category labels and language specific\/researcher specific glosses should be possible.<\/li>\n<\/ul>\n<p>The set of decisions that need to be made to reach such glossing conventions can be overwhelming, leaving researchers feeling like they need more training before they can even begin. 
\u00a0Luckily, however, we know more than we think we know and, as we discuss in the rest of this chapter, the novice glosser can use this knowledge for glossing:<\/p>\n<ol>\n<li>We know or we can learn the expected grammatical categories for a language based on related languages - \u00a0Typology can inform glossing!<\/li>\n<li>We can often guess from context the meaning of a grammatical morpheme even if we don't have a name for it - Placeholder labels are our friends!<\/li>\n<li>We can differentiate between different senses of a morpheme - \u00a0Glosses can reflect the historical development of a morpheme!<\/li>\n<\/ol>\n<p>Standards, such as Leipzig Glossing Rules (LGR), are beneficial for data consistency. \u00a0Such consistency can help with comparison of forms within a corpus and with cross-language comparison. \u00a0 However:<\/p>\n<ol>\n<li>LGR will not provide the vocabulary needed for semantic glossing of all morphemes in all languages. As you learned from the reading above, this was not the goal of LGR. So, even with LGR printed out and at your side when glossing, you will have to come up with glosses for morphemes.<\/li>\n<li><span style=\"text-align: initial;font-size: 14pt\">While a language expert, your consultant, can tell you the meanings of many content morphemes, they will likely not be able to gloss grammatical morphemes. \u00a0<\/span><\/li>\n<\/ol>\n<h2><strong>6.2 Clause and Phrase Segmentation<\/strong><\/h2>\n<p>After speech has been transcribed and translated, the next step is to segment speech into clauses, phrases, words, and morphemes. Create a guide for yourself on what you expect to find at the edge of a clause.\u00a0For Tibeto-Burman languages, this will often be a clause chaining mechanism such as a nominalizer or other clausal subordinator. 
\u00a0Making a list of the members of the closed category of subordinators will help you with identifying clause boundaries.<\/p>\n<p>When using the automatic segmentation tool of SayMore or ELAN, the pauses at the edges of intonational chunks will automatically break connected speech into what roughly corresponds to phrasal or clausal constituents.<\/p>\n<p>Native speakers will usually be more comfortable matching larger phrases between the transcription and translation than individual words and morphemes. \u00a0Here the Hakha Lai sentence discussed in 6.3 is segmented roughly into phrases:<\/p>\n<div class=\"textbox\">\n<p>[tii vaa poo\u014b pa khat a\u0294]\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 [ruul lee h\u014ber\u0294 tee] \u00a0\u00a0\u00a0\u00a0 [an rak um] \u00a0\u00a0\u00a0 [an tii]<\/p>\n<p>[Near a river]\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 [snake and ant]\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 [there was]\u00a0\u00a0\u00a0\u00a0\u00a0 [they say]<\/p>\n<p>\u201cThere were a snake and an ant near a river, they say.\u201d<\/p>\n<\/div>\n<h2><strong>6.3 Free morphemes, Bound Morphemes, and Clitics<\/strong><\/h2>\n<p>Native speaker intuitions are key to the segmentation of utterances into morphemes. Be aware, however, that an additional challenge comes in how morphemes are represented in the practical orthography. Consider the following example from a story in Hakha Lai called \u201cThe Snake and the Ant\u201d from Roengpitya (LTBA 20.2 p.44 R. Roengpitya).<\/p>\n<div class=\"textbox\">\n<p>tii vaa poo\u014b pa khat a\u0294 ruul lee h\u014ber\u0294 tee an rak um an tii<\/p>\n<p>\u201cThere were a snake and an ant near a river, they say.\u201d<\/p>\n<\/div>\n<p>Here both lexical and grammatical \u00a0morphemes are written separately. 
The orthography treats these both the same even though in many cases the grammatical morphemes are bound (must occur with another form) and the lexical morphemes are free (can occur without other morphology). \u00a0 On the morpheme representation and morpheme glossing lines, we can make clear if \u00a0we are dealing with bound or free morphology. \u00a0For example, a morpheme with a dash before or after might signal a bound affix.<\/p>\n<h2><strong>6.4 Morpheme Glossing<\/strong><\/h2>\n<div class=\"textbox\">\n<p>Discussion: Once a clause is divided into phrases and you have the free translation, what next steps will you take to further divide the phrase into content and function morphemes? How will you:<\/p>\n<ul>\n<li>Discover where the head of phrase is?<\/li>\n<li>Develop a hypothesis about the order of morphemes?<\/li>\n<li><span style=\"text-align: initial;font-size: 0.9em\">Identify the edge of\u00a0 a clause or phrase?<\/span><\/li>\n<li>Determine if a morpheme is free, bound, or a clitic?<\/li>\n<li>Gloss a \u00a0grammatical morpheme?<\/li>\n<\/ul>\n<\/div>\n<h3>Identifying grammatical morphemes<\/h3>\n<p>Whether you are a speaker of the language doing the translation yourself, or you are working with a speaker to arrive at the translation you might pay attention to these types of comments:<\/p>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Morphemes that correspond to functional categories in English (pronouns, adpositions, auxiliary verbs, etc.)<\/p>\n<\/header>\n<div class=\"textbox__content\">\"That word corresponds to the <em>to<\/em> part of the phrase\"<\/div>\n<\/div>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Morphemes that require description<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>\u201cThis word means that the action was <em>performed earlier today<\/em>\u201d<\/p>\n<\/div>\n<\/div>\n<div class=\"textbox 
textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Morphemes that don\u2019t change the meaning of the lexical word.<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>\u201cThat word attached to <em>eat<\/em> means <em>the process of<\/em> <em>eating<\/em>\u201d<\/p>\n<\/div>\n<\/div>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Morphemes that don\u2019t mean the same thing when split from the lexical word.<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>\u201cThis word means that the tiger is a woman, but it doesn\u2019t mean <em>woman<\/em> by itself.\u201d<\/p>\n<\/div>\n<\/div>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Morphemes with no clear meaning<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>\u201cI don\u2019t know what that word means.\u201d<\/p>\n<\/div>\n<\/div>\n<p>These and other native speaker intuitions are helpful clues as to whether a morpheme is lexical or grammatical and bound or free.<\/p>\n<h3>Naming grammatical morphemes<\/h3>\n<p>Recall that in the last chapter we were offered the free translation \u201cJean could go\u201d for the French sentence below. 
\u00a0We asked a few questions to a native speaker, allowing us to associate English translations with each word in the French sentence.<\/p>\n<div class=\"textbox\">\n<p>Jean \u00a0\u00a0\u00a0 pourrais \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 aller.<\/p>\n<p>Jean\u00a0\u00a0\u00a0\u00a0 could\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 go<\/p>\n<p>\u2018Jean could go.\u2019<\/p>\n<\/div>\n<p>While the name <em>Jean<\/em> and the verb <em>aller<\/em> \u2018go\u2019 offer more straightforward translations, <em>pourrais<\/em> is more problematic, since it is glossed with the English modal verb \u2018could\u2019 which has several possible interpretations that may or may not line up with the French word including:<\/p>\n<ul>\n<li>Ability: Jean <em>could<\/em> go to the store (before the accident).<\/li>\n<li>Possibility: Jean <em>could<\/em> go to the store (on his way home, if he thinks of it).<\/li>\n<\/ul>\n<p>Furthermore, a native speaker might tell us that <em>pourrais<\/em>:<\/p>\n<ul>\n<li>Has other forms including \u2018pouvoir\u2019 and \u2018peux\u2019 (indicating that the word has several possible forms)<\/li>\n<li>May also be translated as <em>might<\/em>, <em>can<\/em>, or <em>may <\/em>(indicating that its translation into English is contextual).<\/li>\n<\/ul>\n<p>We have now determined that glossing for <em>pourrais<\/em> is not a simple matter! \u00a0Here are the possible strategies we can follow:<\/p>\n<div class=\"textbox shaded\">\n<h4>So how do you gloss it?<\/h4>\n<p><strong>Mark it with &lt;?&gt; and deal with it later.<\/strong> This is the easiest option, but it ignores any information collected about the morpheme in the process of glossing the text. Since it ignores collected information and moves on, it allows the glosser to cover more ground faster.<\/p>\n<p><strong>Translate it with a similar word in the glossing language<\/strong> (like \u2018might\u2019 or \u2018could\u2019). 
The advantage of this option over (1) is that it keeps track of some information discovered in the process of glossing. This option is a workable option for lexical morphemes but ultimately confusing for grammatical ones as:<\/p>\n<ul>\n<li>It is imprecise (<em>might<\/em> can express possibility or permission)<\/li>\n<li>It assumes more analysis exists than has been done<\/li>\n<li>It assumes the translatability of grammatical morphemes between languages that may not have accurate correspondences.<\/li>\n<li>It also may lose information about the syntactic position of the morpheme since words with modal meanings may be expressed in a number of ways in a language not limited to a certain syntactic position (as adverbials or verbs with complement clauses).<\/li>\n<li>It makes the text less searchable, since someone looking to do analysis on modals would have to use a set of English modals (could, would, should, must, can, etc.) to find the data to begin analysis. In other words, it is not tagged in a way that facilitates research.<\/li>\n<\/ul>\n<p><strong>Give it a name like \u2018possibility\u2019 noting whatever your preliminary analysis is, based on context and translation.<\/strong> This option has some of the advantages of (2) in that it keeps information gained from the process of glossing. It is more precise (\u201cthis morpheme expresses possibility\u201d) than giving it a loose translation, but it suffers from the same searchability problem. If morphemes are marked with whatever the researcher feels is the closest analysis, how do you search to find all things with like distribution to further analyze the data?<\/p>\n<p><strong>Mark it as a \u2018modal\u2019 or \u2018verbal auxiliary\u2019 element<\/strong>. 
This option offers a broad category label and is more searchable, but loses more nuanced information about semantics gleaned from the process of glossing.<\/p>\n<\/div>\n<h3><\/h3>\n<h3 style=\"text-align: center\">General grammatical category label<\/h3>\n<p>The method we offer here to meet the challenge of \u00a0how to gloss a grammatical morpheme involves using a two-part name:<\/p>\n<ul>\n<li>Part 1: \u00a0the more clearly understood information like the general category to which the morpheme belongs<\/li>\n<li>Part 2: \u00a0semantic information which is the meaning relayed by the morpheme as indicated by the free translation<\/li>\n<\/ul>\n<p>For example, aspect is a general grammatical category and \u00a0progressive is a specific type of aspect. \u00a0So, the two-part name would be aspect:progressive. \u00a0It may be that you know the left side of the colon or the right side or both. \u00a0You can more rapidly and confidently gloss by stating what you do know and leaving what you don't know for later. \u00a0The system proposed here allows researchers to flexibly represent their understandings of word and morpheme glosses, revising them with increasingly deeper levels of analysis. \u00a0The first step in making this method of hierarchical interlinear glossing (HIGT) work is to create a list of recurring grammatical categories. 
\u00a0And this is where our typological understanding of Tibeto-Burman can help!<\/p>\n<div class=\"textbox textbox--sidebar textbox--examples\"><\/div>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Examples of grammatical categories<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>AGR:\u00a0 inflection agreement morphology for agent or patient and may vary according to person and number<\/p>\n<p>DIR:\u00a0 derivational morphology primarily indicating direction of action of motion verb<\/p>\n<p>SUB:\u00a0 subordinators affixed to finite or nominalized clauses to create subordinate clauses in clause chaining or as complements to verbs<strong>.<\/strong><\/p>\n<\/div>\n<\/div>\n<h3><\/h3>\n<h3>Specific semantic category label<\/h3>\n<p>This is where the specific instantiation of the grammatical category (e.g., subordinator, directional or agreement marker) is identified with a semantic label.\u00a0 For example:<\/p>\n<p>(1) subordinator: simultaneous<\/p>\n<p>(2) subordinator: sequential<\/p>\n<p>(3) subordinator: after acting<\/p>\n<p>The general grammatical category is subordinator. \u00a0The specific instantiations are simultaneous or sequential.<\/p>\n<p>Compare (2) and (3). \u00a0Imagine that annotator A writes (2) and annotator B writes (3). \u00a0Annotator B might simply not know the term sequential, but both annotators are referring to the same morpheme. 
\u00a0We can recapture at least some of that information by looking at the left side of the colon and the morpheme itself.<\/p>\n<p>It is difficult, to put it mildly, to define the semantics of a morpheme and incorporate all aspects of information about that morpheme in a gloss.\u00a0 Simply standardizing across all annotators may actually hide the nuances of meaning we gain from what is on the right side of the colon.\u00a0 So, both left and right side glossing as a pair will provide the best tracking.\u00a0 It will also allow the annotator to write down with certainty the left side while figuring out the right side.<\/p>\n<p>Here are some clues that can go into your decision on the semantic label for a morpheme.\u00a0 These sources of hypotheses are valuable and merit preserving, especially in forming early analyses.<\/p>\n<div class=\"textbox shaded\">\n<p style=\"text-align: center\"><strong>Information from existing sources on the language<\/strong><\/p>\n<p style=\"text-align: center\"><em>A previous researcher calls this a past tense marker.<\/em><\/p>\n<p>&nbsp;<\/p>\n<p style=\"text-align: center\"><strong>Contextual information from the text about position and meaning<\/strong><\/p>\n<p style=\"text-align: center\"><em>Past tense information in the translation doesn\u2019t appear elsewhere in the sentence, so it may be tied to this morpheme<\/em>.<\/p>\n<p>&nbsp;<\/p>\n<p style=\"text-align: center\"><strong>Translations, meta-information, and other intuitions offered by native speakers<\/strong><\/p>\n<p style=\"text-align: center\"><em>The language assistant said, \u2018I think that part means that it happened already.\u2019<\/em><\/p>\n<p>&nbsp;<\/p>\n<p style=\"text-align: center\"><strong>Typological and theoretical research on related phenomena<\/strong><\/p>\n<p style=\"text-align: center\"><em>Usually, morphemes with this kind of behavior are called past-tense markers in other languages.<\/em><\/p>\n<p>&nbsp;<\/p>\n<p style=\"text-align: 
center\"><strong>Analyses of phenomena in related languages<\/strong><\/p>\n<p style=\"text-align: center\"><em>This morpheme looks like a past tense marker in a related language.<\/em><\/p>\n<p style=\"text-align: center\"><strong>Your own intuition<\/strong><\/p>\n<p style=\"text-align: center\"><em>Based on my past experiences, this seems to be marking past tense<\/em>.<\/p>\n<\/div>\n<p>You can read more about the process of semantic glossing here: Bochnak, M. Ryan and Lisa Matthewson 2020.\u00a0<a href=\"https:\/\/linguistics.sites.olt.ubc.ca\/files\/2019\/07\/BochnakMatthewsonAnnualReviewSemanticFieldwork.pdf\">Techniques in complex semantic fieldwork.<\/a>\u00a0Annual Review of Linguistics 6:261-283.<\/p>\n<p>&nbsp;<\/p>\n<h3 style=\"text-align: center\">Updating analyses<\/h3>\n<p>Glosses are analyses, and analyses require updates. Consider the following Zophei example:<\/p>\n<div class=\"textbox\">\n<p>a-pa-va-ming<\/p>\n<p>\u2018He checked up on me.\u2019<\/p>\n<\/div>\n<p>In this example, we have identified:<\/p>\n<ul>\n<li>That the first two morphemes in this complex word are agreement markers associated with a 3rd person singular subject (AGR:3SBJ) and a 1st person object (AGR:1OBJ).<\/li>\n<li>That <em>vaming<\/em> means 'to check up on', but <em>ming<\/em> means 'to watch'.<\/li>\n<li>That the native speaker doesn't have a clear intuition about what <em>va<\/em> means by itself, saying it may be related to the sense that you go somewhere to check up on someone, or to the event happening in the past.<\/li>\n<\/ul>\n<p>We decide that <em>va<\/em> is a morpheme that we should gloss, but we are unsure what to call it without more data.<\/p>\n<div class=\"textbox\">\n<p>a-pa-<strong>va<\/strong>-ming<\/p>\n<p>AGR:3SBJ-AGR:1OBJ-_______-watch<\/p>\n<p>\u2018He checked up on me.\u2019<\/p>\n<\/div>\n<h3 style=\"text-align: center\">First pass analysis- placeholder for position<\/h3>\n<p>One useful strategy is to label this morpheme according to its position, e.g., PRVP for 
preverbal particle. This way, it can be easily found when gathering up other morphemes in this position, other 'pre-verbal particles', when we are ready for the next iteration of morphological analysis. Since the speaker noted that it could have to do with movement or with tense, our hypotheses are that it is a directional (DIR) or past tense (PST) marker, and we can make a note of this. FLEx provides multiple note fields for such reminders and the note fields can be searched.<\/p>\n<div class=\"textbox\">\n<p>a-pa-<strong>va<\/strong>-ming<\/p>\n<p>AGR:3SBJ-AGR:1OBJ-<strong>PRVP<\/strong>-watch<\/p>\n<p>\u2018He checked up on me.\u2019<\/p>\n<\/div>\n<p>In our next pass of analysis, to make progress on this morpheme, we can use search features to:<\/p>\n<ul>\n<li>find all items transcribed as <em>va<\/em><\/li>\n<li>find all items marked as PRVP to compare <em>va<\/em> with other markers in the same or a similar pre-verbal position<\/li>\n<li>find all items marked with DIR to compare with other suspected directional markers<\/li>\n<li>find all items marked with PST to compare with other suspected past tense markers<\/li>\n<\/ul>\n<h3 style=\"text-align: center\">Second pass analysis<\/h3>\n<p>After considering <em>va<\/em> with other pre-verbal markers in the language, we have decided that we are confident in calling it a directional marker. From here, we revise our category to DIR, but we aren't sure what subcategory to use. 
Rather than leave it blank, we mark it with <em>uk<\/em> to indicate that it is unknown information.<\/p>\n<p>DIR:uk<\/p>\n<p>Better yet, we can mark other information, such as that the morpheme indicates the associated motion is going, or indicates movement across level ground, or other pieces of potentially relevant information.<\/p>\n<p>DIR:going<\/p>\n<p>DIR:level<\/p>\n<p>With increasing nuance, we can create more accurate glosses.<\/p>\n<div class=\"textbox\">Activity: Thinking about a language you are currently working on, make a list of at least 15 bound morphemes. What are the meanings of these morphemes? How would you gloss these using hierarchical glossing? What general grammatical category do they belong to? What specific semantics do they have?<\/div>\n<p>&nbsp;<\/p>\n<h2>6.5 Glossing Hakha Lai<\/h2>\n<p>With the proposed workflow in mind, let's work back through the Hakha Lai sentence previously divided into phrases in 6.2.<\/p>\n<p>&nbsp;<\/p>\n<div class=\"textbox\">\n<p>[tii vaa poo\u014b pa khat a\u0294]\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 [ruul lee h\u014ber\u0294 tee] \u00a0\u00a0\u00a0\u00a0 [an rak um] \u00a0\u00a0\u00a0 [an tii]<\/p>\n<p>[Near a river]\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 [snake and ant]\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 [there was]\u00a0\u00a0\u00a0\u00a0\u00a0 [they say]<\/p>\n<p>\u201cThere were a snake and an ant near a river, they say.\u201d<\/p>\n<\/div>\n<p>Let's consider each phrase individually, assuming we have a native speaker to ask questions of.<\/p>\n<h3>\"Near a River\"<\/h3>\n<p>A native speaker offers the following translations of individual words.<\/p>\n<div class=\"textbox\">\n<p>tiivaa \u00a0 <strong>poo\u014b \u00a0\u00a0 pakhat \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 a\u0294\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 
<\/strong><\/p>\n<p>river<strong>\u00a0\u00a0\u00a0 <em>near<\/em>\u00a0\u00a0\u00a0 <em>one<\/em>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <em>at<\/em><\/strong><\/p>\n<p>\u201cNear a river\u201d<\/p>\n<\/div>\n<h3><em>tiivaa<\/em><\/h3>\n<p>The native speaker indicates that <em>tiivaa<\/em> means \u2018river\u2019, so we have combined the syllables <em>tii<\/em> and <em>vaa<\/em>. The word <em>tii<\/em> means water, but <em>vaa<\/em> has no clear and apparent meaning, so we decide it may be related historically, but is not clearly decompositional in the modern language. We combine the syllables into the same word \u00a0and gloss it together as 'river'.<\/p>\n<h3><em>poo\u014b<\/em><\/h3>\n<p>This morpheme appears after the noun with the meaning \u2018near\u2019, but the phrase still needs the post-position <em>a\u0294<\/em>, currently translated as \u2018at\u2019. So, it is likely <em>poo\u014b<\/em> is a noun with the translation 'near', meaning something like \u2018the area near\u2019. So, we\u2019ll just translate it as \u2018near\u2019.<\/p>\n<h3><em>pakhat<\/em><\/h3>\n<p>Our free translation for this word is \u2018a\u2019, a function word in English that expresses that the noun is indefinite. Without any other information about this language, we can assume that a translation of \u2018a\u2019 is likely not appropriate based on its syntactic category and semantic meaning. To start, we do not yet know whether the language encodes definiteness at all, let alone how it is marked. Next, if we ask for other numbers, we get <em>pahnih<\/em> \u2018two\u2019, <em>pathum<\/em> \u2018three\u2019, and <em>pali<\/em> \u2018four\u2019. Since each word starts with <em>pa<\/em>-, we can hypothesize that <em>pa<\/em>- is a separate morpheme, likely a numeral classifier. We can also guess that if there is one numeral classifier, there will be others. 
So, we should leave room to update our annotation as our understanding of the phenomenon builds. For this one, we\u2019ll separate the word into two morphemes and gloss them:<\/p>\n<div class=\"textbox\">\n<p><em>pa-khat<\/em><\/p>\n<p>CLF:uk-one<\/p>\n<\/div>\n<p>This option captures the information that:<\/p>\n<p>(a) <em>pa-<\/em> is a numeral classifier (CLF)<\/p>\n<p>(b) We don\u2019t know what contexts this CLF is used in, or what other CLFs we\u2019ll come across, but we want to leave a placeholder to later add that information (CLF:<strong>uk<\/strong>)<\/p>\n<p>(c) The numeral element <em>khat<\/em> means \u2018one\u2019<\/p>\n<h3><em>a\u0294<\/em><\/h3>\n<p>Our translation for this word is another function word, \u2018at\u2019. In English, \u2018at\u2019 has highly variable and idiosyncratic uses (e.g. \u2018at school\u2019, \u2018at 1pm\u2019, \u2018at last\u2019), so we know a translation will not be an accurate way to represent this Hakha Lai word. What we do know is that it marks a noun phrase as an adjunct (not a required part of the sentence), so we can call it an <em>oblique case marker<\/em> (OBLCM). Since, at least in this context, it refers to a location, we will tentatively label it \u2018locative\u2019 (LOC). 
So, our gloss is OBLCM:LOC<\/p>\n<p>Here is the new gloss:<\/p>\n<div class=\"textbox\">\n<p>tiivaa \u00a0 poo\u014b \u00a0\u00a0 pa-khat \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 a\u0294<\/p>\n<p>river\u00a0\u00a0\u00a0 near\u00a0\u00a0\u00a0\u00a0 CLF:uk-one\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 OBLCM:LOC<\/p>\n<p>\u201cNear a river\u201d<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<h3>\"A snake and an ant\"<\/h3>\n<p>A native speaker offers the following translations of individual words.<\/p>\n<div class=\"textbox\">\n<p>ruul \u00a0 \u00a0 lee \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 h\u014ber\u0294tee.<\/p>\n<p>snake\u00a0\u00a0 and\u00a0\u00a0\u00a0\u00a0\u00a0 \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 ant<\/p>\n<p>\u201ca snake and an ant\u201d<\/p>\n<\/div>\n<h3><em>h\u014ber\u0294tee<\/em><\/h3>\n<p>These two syllables together are translated as \u2018ant\u2019 so we\u2019ve combined syllables in our transcription accordingly.<\/p>\n<h3><em>lee<\/em><\/h3>\n<p>The translation we got for this word is \u2018and\u2019, which is a conjunction. Conjunctions may have more or less limited contexts where they can show up, and a language may have many conjunctions. So, rather than gloss this (and likely other conjunctions) as \u2018and\u2019, we can gloss this as a conjunction (CONJ). 
To flag that there may be more relevant information to add to this gloss, we will add 'uk' as a subcategory and leave it to future study (CONJ:uk).<\/p>\n<p>Here is the new gloss:<\/p>\n<div class=\"textbox\">\n<p>ruul \u00a0 \u00a0 \u00a0lee \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 h\u014ber\u0294tee.<\/p>\n<p>snake\u00a0\u00a0 CONJ:uk \u00a0 \u00a0 \u00a0 \u00a0 ant<\/p>\n<p>\u201ca snake and an ant\u201d<\/p>\n<\/div>\n<h3>\"There was\"<\/h3>\n<p>A native speaker offers the following translations of individual words.<\/p>\n<div class=\"textbox\">\n<p>an \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 rak \u00a0\u00a0\u00a0\u00a0\u00a0 um<\/p>\n<p>?\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 ? \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <em>was<\/em><\/p>\n<p>\u201cthere was\u201d<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p style=\"text-align: center\"><em>an &amp; rak<\/em><\/p>\n<p>Currently, the translation gives us few clues about the function of these two morphemes. We can call both of them pre-verbal particles (PRVP:uk) and move on.<\/p>\n<h3><em>um<\/em><\/h3>\n<p>This verb is currently translated as \u2018was\u2019. Words translated with a form of \u2018be\u2019 are often some type of copula (COP); how past tense is represented is still unclear. 
We also know that here it is reporting the <em>existence<\/em> of a snake and an ant at a place, so we can add that information in the subcategory gloss (COP:exist)<\/p>\n<p>Here is the new gloss:<\/p>\n<div class=\"textbox\">\n<p>an \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 rak \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 um<\/p>\n<p>PRVP:uk \u00a0 \u00a0 \u00a0 \u00a0 PRVP:uk \u00a0 \u00a0 \u00a0 \u00a0 \u00a0COP:exist<\/p>\n<p>\u201cthere was\u201d<\/p>\n<\/div>\n<h3>\"They say\"<\/h3>\n<p>A native speaker offers the following translations of individual words.<\/p>\n<div class=\"textbox\">\n<p><strong>an<\/strong> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 tii<\/p>\n<p><strong>?<\/strong> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 say<\/p>\n<p>\u201cthey say\u201d<\/p>\n<\/div>\n<h3><em>an<\/em><\/h3>\n<p>We saw this morpheme in the previous line, but here we get the clue that it should correspond with the subject \u2018they\u2019. This lets us know that it is likely a pronoun, pronominal clitic, or agreement marker. For now, without other information about the language, we can add subject information to the subcategory gloss (PRVP:3plsubj), keeping this morpheme in the \u2018pre-verbal particle\u2019 category so it is easily compared against other PRVPs. 
(If we have enough information to call this, for example, an agreement marker, we could use AGR:3plsubj.)<\/p>\n<p>Here is the new gloss:<\/p>\n<div class=\"textbox\">\n<p><strong>an<\/strong> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 tii<\/p>\n<p>PRVP:3plsubj say<\/p>\n<p>\u201cthey say\u201d<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<h3>Working towards linguistic analysis<\/h3>\n<p>Here is our updated gloss:<\/p>\n<div class=\"textbox\">\n<p>tiivaa \u00a0 poo\u014b \u00a0\u00a0 pa-khat \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0a\u0294<\/p>\n<p>river\u00a0\u00a0\u00a0 near\u00a0\u00a0\u00a0\u00a0 CLF:uk-one\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 OBLCM:LOC<\/p>\n<p>'near a river'<\/p>\n<p>&nbsp;<\/p>\n<p>ruul \u00a0 \u00a0 \u00a0lee \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0h\u014ber\u0294tee.<\/p>\n<p>snake\u00a0\u00a0 CONJ:uk \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 ant<\/p>\n<p>'a snake and an ant'<\/p>\n<p>&nbsp;<\/p>\n<p>an \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 rak \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0um<\/p>\n<p>PRVP:uk \u00a0 \u00a0 \u00a0 \u00a0 PRVP:uk \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 COP:exist<\/p>\n<p>'there was'<\/p>\n<p>&nbsp;<\/p>\n<p>an \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 tii<\/p>\n<p>PRVP:3plsubj \u00a0 \u00a0say<\/p>\n<p>'they say'<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p>Some of these glosses are good enough to work with until we have more examples of the same word, or of other words in the same category, for analysis. OBLCM:LOC and COP:exist, for example, have both a grammatical category and a semantic category gloss, getting us close to the gloss we may ultimately settle on for our corpus. 
CONJ:uk and CLF:uk are also likely good enough until we have many examples of conjunctions and numeral classifiers for a larger and more detailed look at these categories.<\/p>\n<p>The gloss PRVP, however, only works as a temporary landing spot for morphemes showing up before the verb, within the same roughly segmented phrase. We can leave these and move on, but after a bit more consideration, we may be able to move these morphemes to more apt categories.<\/p>\n<h3><em>an<\/em><\/h3>\n<p>The first problem we can see is that <em>an<\/em> is glossed as both PRVP:uk and PRVP:3plsubj. In the first instance, the translation has an expletive (or \u201cdummy\u201d) subject \u2018there\u2019, and in the second instance, it is translated as \u2018they\u2019 (referring to unknown or unspecified people). More examples will be needed to determine the grammatical category.<\/p>\n<p>Visit the CoRSAL website to find links to IGT examples. We invite you to try out this method with IGT at your disposal. What are the shortcomings of this method? 
What are the advantages?<\/p>\n<p>In the next section, we will review the use of the program FLEx where annotation can be stored and revised to record the growing understanding of the annotator.<\/p>\n","protected":false},"author":19,"menu_order":6,"template":"","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-103","chapter","type-chapter","status-publish","hentry"],"part":3,"_links":{"self":[{"href":"https:\/\/openbooks.library.unt.edu\/sourcetoanalysis\/wp-json\/pressbooks\/v2\/chapters\/103","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/openbooks.library.unt.edu\/sourcetoanalysis\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/openbooks.library.unt.edu\/sourcetoanalysis\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/openbooks.library.unt.edu\/sourcetoanalysis\/wp-json\/wp\/v2\/users\/19"}],"version-history":[{"count":40,"href":"https:\/\/openbooks.library.unt.edu\/sourcetoanalysis\/wp-json\/pressbooks\/v2\/chapters\/103\/revisions"}],"predecessor-version":[{"id":337,"href":"https:\/\/openbooks.library.unt.edu\/sourcetoanalysis\/wp-json\/pressbooks\/v2\/chapters\/103\/revisions\/337"}],"part":[{"href":"https:\/\/openbooks.library.unt.edu\/sourcetoanalysis\/wp-json\/pressbooks\/v2\/parts\/3"}],"metadata":[{"href":"https:\/\/openbooks.library.unt.edu\/sourcetoanalysis\/wp-json\/pressbooks\/v2\/chapters\/103\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/openbooks.library.unt.edu\/sourcetoanalysis\/wp-json\/wp\/v2\/media?parent=103"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/openbooks.library.unt.edu\/sourcetoanalysis\/wp-json\/pressbooks\/v2\/chapter-type?post=103"},{"taxonomy":"c
ontributor","embeddable":true,"href":"https:\/\/openbooks.library.unt.edu\/sourcetoanalysis\/wp-json\/wp\/v2\/contributor?post=103"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/openbooks.library.unt.edu\/sourcetoanalysis\/wp-json\/wp\/v2\/license?post=103"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}