It's been over two years since the release of Toaq Delta—practically an era in the scale of this conlang's history. I'd like to take a moment to look back at everything that's happened with the language in this time, and provide some thoughts on the direction it could take next.
Toaq has grown in nearly every respect: the lexicon, the corpus, the tooling, the lore (!?), our ability to speak it, our understanding of its semantics… and it's certainly grown in our hearts as well. I believe for all of us who participate in toaqdiu, it's been quite the undertaking to wrap our heads around the new perspective of reasoning about a loglang in terms of generative grammar. But it's also been rewarding to see all the new insights that come of this, and generally how much more natural speech has become. Toaq suddenly felt grown up, and lots of us have been excited to finally call Delta "the stable release".
The grammar we have now succeeds at explaining a good deal more than the greedy, non-linguistic PEG grammars of the past. At the same time—at least for me—getting in the mindset of language being generated by simple compositional rules and syntactic operations has left me wondering, what would it take for the grammar to explain everything? Speakers and learners alike keep getting into discussions about why underfilling, subclauses, and sentence fences are the way they are currently. At the heart of the matter: greediness and default behaviors.
Greedy parsing, in which a single production rule consumes as much input as possible until it becomes saturated, seems to be an inherently non-linguistic mechanism. It's telling that you can transform any ambiguous grammar into an unambiguous grammar by deferring to greediness—you will also rule out many potentially useful parses along the way, which speakers' minds will likely still be tempted to generate. In the case of trailing adjuncts in subclauses, Hoemai has suggested that we might be able to invent a linguistic explanation for why only the greedy parses are generated, but this has remained elusive.
If the goal is for Toaq to feel as much like a natural language as possible while still retaining monoparsing, I think it would be a worthwhile exercise to fully give up greediness, and see what lies beyond. That is, take the Toaq we have today, blow it up (allow it to have syntactic ambiguity), and then explore what plausibly linguistic means are available to return it to monoparsing. As it turns out, it's more than just subclauses that would need rethinking.
But all that about monoparsing aside, there are also some explanatory gaps in the semantics of Toaq that have been on my mind recently, as well as some general opportunities to make the language more regular or expressive in fun ways, to make provisions for the ways in which we've naturally come to use it. Some of these I've already started to explore on the wiki, others I've been waiting until the right time to present.
Ultimately, the ideas piled up to such a height that they began to form something greater—a vision of a Toaq Epsilon. Now, this isn't my conlang, but I feel compelled to at least share the discoveries I've made. I hope you feel ready to explore such a vision, too. So, consider this a bit of fan-fiction.
Meet the tones
A common theme behind some of the ideas I'll be presenting is to optimize for prosody. We often speak of optimizing for brevity, making common constructs require few syllables, but in the bigger picture of how language is used, brevity really doesn't deserve as much importance as we lend it. Favorable prosodic qualities are what make a construct stick in the mind and roll off the tongue, even if it takes many syllables.
On that note, I really like what the tone sandhi rules in Toaq Delta do for prosody. They allow some words to be pronounced entirely without stress, keeping them lightweight. However, I often find myself wanting to pronounce words without stress in situations that the segmentation rules don't yet accommodate. I would like to explore an alternative set of rules in which unstressed pronunciations are more universally available.
Let's imagine a world in which there are 3 tones: falling, rising, and hiatus. Each of these has a stressed and an unstressed realization. The low glottal tone isn't actually gone, rather, it's been merged with the hiatus tone to become its unstressed variant.
These unstressed realizations are available on any single-raku word, no matter what tones surround them. Tone contours still extend across the entire word as in New Segmentation, but multi-raku words are additionally characterized by their tone contour plateauing off at the middle of the vocal range.
I particularly like that this allows object incorporation and common adverbs to become more lightweight, and for determiners following the verbal complex to remain unstressed. Here's an example of the tones in practice:
It's like a blend of New Segmentation and old Toaq Beta segmentation that plays to the strengths of each: each word always bears a continuous tone contour, but so can many sequences of words. It's made possible by the small number of phonemic tones. I've been calling it Saqseq ("Three-seg").
Blowing things up
Next let's look at sentence segmentation.
A few words that were previously semi-optional (required whenever greedy parsing would otherwise "do the wrong thing") become fully optional. This includes:
The complementizer ꝡa
The speech act particles da and móq
The adjective head kı-
Or at least, they become as optional as morphological constraints will allow. Namely, if a tonal morpheme wants to attach to ꝡa, this may force it to become overt. On paper this kind of change may look small, but I suspect it could actually have a huge impact on how we learn and speak Toaq. No longer do you need to question, for every length of text you produce, "will the parser agree with the syntax I intended?" Rather, it's enough to know that each constituent was produced according to the correct grammar rules, and each word is morphologically sound.
So how could we make Toaq self-segmenting with respect to sentences and adjectives? They need to withstand the worst case scenario in which everything is covert. Buckle up, because this affects quite a few of Toaq's features.
Underfilling
First of all, we'll need to disallow underfilling of verbs, including those used as nouns, adjectives, or adverbs. I see this as a worthwhile change in itself as it encourages us to be more critical about how many 'places' verbs have and speak in a manner that relates arguments more clearly to the rest of the sentence. (Consider sá jı̣paı / tú sạpaı.) Notably, any verbs that we currently consider optionally ditransitive would become transitive verbs, taking adjuncts instead of indirect objects.
Luı do jí tú, ꝡê deq do jí hóa.
I've given everything I can.
Do jí kúe cû báq jı̣paı.
I gave the book to a friend of mine.
Jua hú. Duashao jí, ꝡá luı faq hí raı.
That's strange. I wonder what happened.
The last example segments in this way because this is the only grammatical way to analyze it. Placing the period after duashao instead would strand the arguments jí and ꝡá luı faq hí raı on the right, which do not contribute to a grammatical sentence. Not to mention, duashao would lack an object.
*Jua hú duashao. Jí, ꝡá luı faq hí raı.
This is the critical difference in moving away from greedy parsing: at any position in the text, the parser may be in multiple alternative states (for example, a state which expects hú to have a complement and another which expects hú to stand on its own), but no more than one of these states may represent a finished syntactic structure. Earley parsers are cut out for exactly this job.
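To make the uniqueness claim concrete, here is a minimal Python sketch. It is a brute-force enumerator rather than an Earley parser, and everything in it is an assumption made for illustration: the place counts in `ARITY`, the `CLAUSE` token standing in for the whole content clause, and the toy rule that hú either stands alone or takes a verb whose remaining places must all be filled.

```python
# Hypothetical place counts for the two verbs in the example.
ARITY = {"jua": 1, "duashao": 2}
# Tokens that count as a complete argument on their own.
# CLAUSE stands in for the content clause "ꝡá luı faq hí raı".
SIMPLE_ARGS = {"jí", "CLAUSE"}


def parse_arg(tokens, i):
    """Yield every index j such that tokens[i:j] is one argument."""
    if i >= len(tokens):
        return
    if tokens[i] in SIMPLE_ARGS:
        yield i + 1
    elif tokens[i] == "hú":
        yield i + 1  # state 1: hú stands on its own
        # state 2: hú takes a verb complement; toy assumption: hú binds
        # the verb's first place and the rest must be filled (no underfilling)
        if i + 1 < len(tokens) and tokens[i + 1] in ARITY:
            yield from parse_args(tokens, i + 2, ARITY[tokens[i + 1]] - 1)


def parse_args(tokens, i, n):
    """Yield every index where exactly n arguments starting at i can end."""
    if n == 0:
        yield i
        return
    for j in parse_arg(tokens, i):
        yield from parse_args(tokens, j, n - 1)


def parse_sentence(tokens, i):
    """A sentence is a verb followed by exactly as many arguments as it has places."""
    if i < len(tokens) and tokens[i] in ARITY:
        yield from parse_args(tokens, i + 1, ARITY[tokens[i]])


def segmentations(tokens, i=0):
    """Yield every way of splitting tokens[i:] into complete sentences."""
    if i == len(tokens):
        yield []
        return
    for j in parse_sentence(tokens, i):
        for rest in segmentations(tokens, j):
            yield [tokens[i:j]] + rest


tokens = ["jua", "hú", "duashao", "jí", "CLAUSE"]
for seg in segmentations(tokens):
    print(seg)  # [['jua', 'hú'], ['duashao', 'jí', 'CLAUSE']]
```

Both readings of hú are explored, but only the one where hú stands alone leads to a text made up entirely of finished sentences, so a single segmentation is printed.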
Serials
Next, we should require any verb that forms serials with its subject (that is, a verb with serial frame 0, or 1 if you believe in 1, such as du, kuqnu, fuı) to form a serial whenever it is used as a noun, adjective, or adverb.
Paı nháo báq du req, ꝡê ıu daragoq hóa. He suoıdeq hó.
She's friends with someone who appears to be human but is actually a dragon. He (the dragon) has amazing abilities.
*Paı nháo báq du.
This part is a little lacking in independent motivation, but you could rationalize it as: nouns need to supply the DP with an animacy class, just as adjuncts need to supply an adjunct class (eventive vs. subject-sharing). Verbs that form serials with their subject always defer to the inner verb for such features; therefore they need an inner verb.
Objects
Object incorporation also causes problems for sentence segmentation as it is currently. I suggest we distinguish fronted objects at the clause level from object incorporation in nouns/adjectives by marking these with different tones. The hiatus tone is used for fronted objects, while the falling tone performs object incorporation in DPs.
Daragoq hú. Paı lô chıampoq geo hú.
He's a dragon. He's friends with the old sorceress.
Daragoq hú paı lo chıampoq. Geo hú.
That friend of the sorceress is a dragon. He's old.
Naı chum joaı bâq kurı jí.
I'm looking for berries.
Ma zao súq ké poq hıq reum nanı gıaq?
Do you know the person who just performed that song?
You can remember this by the mnemonic that the hiatus tone attaches things to clauses (adverbs), while the falling tone attaches things to nouns (adjectives).
Meanwhile, predicatizers as a part of speech are removed. They become normal verbs which may or may not be used with object fronting.
Mea súq nhána.
Mea nhâna súq.
You are one of them.
Adjectives
It's problematic for sentence segmentation if both the verbal complex and nouns can have adjectives. And for that matter, it's been hard to find a syntactic explanation of how adjectives would attach to the verbal complex in the first place. Luckily, there's a very similar construct that says hello: subject-sharing adverbs.
The motto of this Toaq is that adjectives and adverbs are all simply adjuncts. That is, there is a logical equivalence between the following sentences.
Coq ní kusera báq deo nıo bohıaq.
Coq ní kusera báq, ꝡê deo nîo bôhıaq hóa.
This painting depicts a young child in poverty.
Mea nhâna báq he suaq de.
Mea nhâna báq, ꝡê he suaq dê hóa.
Among them are beautiful singers (those who sing beautifully).
Because adverbs provide the very same semantics as adjectives, we don't need the verbal complex to admit adjectives at all. And now that we don't need kı- to act as an adjective head, we're free to repurpose it for something else. I suggest we use it to force an adjunct to behave as a subject-sharing adjunct.
Bo jí báq dote shuı.
I have something which is secretly a gift.
Bo jí báq dote kı̣shuı.
I have a secret gift.
Adjuncts
Finally, we need to change the form of fronted adjuncts to avoid the kind of ambiguity seen in the following sentence structures.
Hao nháo râo ní, ꝡê faq hóa. Jôı nánı nâ hao ráı.
Hao nháo. Râo ní, ꝡê faq hóa jôı nánı, nâ hao ráı.
Two facts come together to suggest the solution: fronted adjuncts are mostly useful for prepositions carrying heavy objects, and unlike other constructs that use nâ, fronted adjuncts don't trigger a resumptive hóa. Perhaps it's really the object that triggers the fronting in the first place, and the preposition is simply along for the ride due to 'pied-piping'. And perhaps the nâ in this construction is secretly the adjunct head.
Bıe hû tue mabala po ke tuchao nâ, chuq tûı jí báq chuhaq.
After that horrible ordeal with the bus, I sat down and had some lunch.
The object of the preposition takes the hiatus tone, since this fronting process is assumed to work in a similar manner to object fronting in clauses. Fronted adverbs don't exist because they have no object that could actually trigger the fronting.
*Shuı nâ toaqpoq nháo.
(Intended) She's secretly a Toaqist.
Shrinking things down
Whew, that was a lot. By making the grammar more strict we've got sentence boundaries under control. But now some guy named H. P. Grice is unhappy with us. He's on our case with his "conversational maxims" and keeps going on about the importance of brevity. Sigh…
To appease him we should bring underfilling back, but in a manner more fitting of this stricter grammar—and ideally with clear semantics as well. One common use case for underfilling is what Lojban calls 'observatives', in which all verbal arguments are understood to be existentially quantified. Let's use the word ía for this, as a pro-form which takes the place of all verbal arguments:
Arane ía ꝡo!
Watch out, there's a spider!
Sho dua jí, ꝡá choa ía nîe ké nıakua.
I learned that there are voices coming from the cellar.
Another thing we should consider is making a formal provision for sentence fragments: noun forms, verb forms, and adjuncts that stand on their own. You probably encounter these all the time on signs and in response to questions. Considering Toaq's goals, it would be ideal if these fragments were explicitly marked so that they segment themselves from sentences in speech, but with minimal overhead.
Let's use another tonal morpheme for this: ıa (falling tone). In many cases it simply fuses with another word so that the explicit marking of fragments "costs nothing".
Chum hı̣tao súq móq?
What are you up to?
Ia haqbaı da.
Making food.
Ia ké fuaı bẹkaıdıu → Ke fuaı bẹkaıdıu
The Bureau of Orthography
Ia jóao lîa híu. → Joao lîa híu.
By the river, mainly.
Bo súq tíopuı nıaq?
How old are you?
Jeojuı ıa gúheısaq. → Jeojuı guheısaq.
Almost twenty-three.
Jeo ıa pápa. → Jeo lo papa.
Dad does.
Remember how I said nâ is an adjunct head? Well, that shows up here too.
Ia hûıneq. → Na huıneq.
Unfortunately.
Syntactically, ıa takes its complement to a complete clause, meaning it can appear with various speech act particles and even be embedded in a content clause. Semantically, it's similar to hao in that it invokes the notion of a salient predicate.
Information structure
The biggest question still is what to do with subclauses. Banning underfilling has solved one of the major challenges, but without the explanatory power of greedy parsing, we have no reason why the trailing adjuncts in subclauses couldn't just as well belong to the outer clause.
But consider the following:
We already place most subclauses at the end of their containing clause.
A sensible place for the outer clause's trailing adjuncts to go is just before the subclause.
Whenever subclauses are center-embedded, this can make your sentence harder for listeners to parse due to its unbalanced prosodic structure.
We can use fronting to improve the prosodic structure of sentences with center embedding, but it would also be nice to have a way of delaying heavy constituents by extraposing them.
In languages such as Dutch and German extraposition is in fact mandatory, at least for content clauses, no matter how heavy the constituent actually is.
I think these points all lend support to mandatory fronting or extraposition as a viable solution for subclauses. So, what would that look like?
You may have already noticed that I'm using the rising tone (ꝡá) for content clauses and the hiatus tone (ꝡê) for relative clauses. I understand these tones as coming from the tonal morphemes có and sû, which allow any CP to behave as a verbal argument or a restrictive clause, respectively.
Dua jí, ꝡá dua jí sía raı.
I know that I know nothing.
Tısha jí ké jearıaq, ꝡê luı raqkuq hú paı jı hóa.
I arrived at the market which that friend of mine had mentioned.
Now we have to be mindful of the position of each subclause within its parent clause. As long as a subclause appears at the very end, as in the above examples, everything's fine; the sentence is prosodically balanced. But whenever a subclause is followed by another constituent, prosodic constraints will motivate the clause to become extraposed. This causes the word có or sû to be stranded in the clause's original position, detached from the complementizer.
Nheje, ꝡá pu buıfa nháo gúa, râo fíachaq. → Nheje có râo fíachaq, ꝡa pu buıfa nháo gúa.
It became known yesterday that he had left the country.
Tısha báq poq, ꝡê bu zao jí hóa, gúaırıaq. → Tısha báq poq sû gúaırıaq, ꝡe bu zao jí hóa.
Some people showed up at the office who I didn't know.
This has an interesting effect on how information is structured in Toaq: rather than a tree of clauses, sentences more often resemble a linear sequence of clauses in which lightweight particles specify how each clause is related to the next. You get the big picture first, and details later.
In the rare case that a clause contains multiple subclauses, none of which are fronted or in final position, the subclauses will all be extraposed opposite their order of appearance. I really don't have a good example of this because, given all the alternative strategies available to the speaker, it shouldn't normally come up.
Kuq dûa, ꝡá bu toa Tôaq shú «cecuo», sá poq, ꝡê sıom hóa Tóaqzu, shú «cecuo». → Kuq dûa có sá poq sû shú «cecuo», ꝡe sıom hóa Tóaqzu, ꝡa bu toa Tôaq shú «cecuo».
Knowing that 'cecuo' was not a Toaq word, a person who was learning Toaq spoke the word 'cecuo'.
At any rate, this sentence structure would be dispreferred to one that uses fronting and puts the preposition at the end, due to the poor locality of the extraposed phrases.
Sá poq, ꝡê sıom hóa Tóaqzu, nâ kuq hó shú «cecuo» dûa, ꝡá bu toa Tôaq shú «cecuo».
A person who was learning Toaq spoke the word 'cecuo', knowing that 'cecuo' was not a Toaq word.
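The extraposition pattern from the examples above can be sketched as a small Python function. This is only an illustration under stated assumptions: the `(kind, text)` representation and the name `linearize` are made up here, fronted subclauses are not modeled, and the choice to keep a final subclause ahead of any extraposed ones is a guess, since the text does not cover that mixed case.

```python
# kind -> (stranded marker, complementizer in situ, complementizer when extraposed)
FORMS = {
    "content": ("có", "ꝡá", "ꝡa"),
    "relative": ("sû", "ꝡê", "ꝡe"),
}


def linearize(constituents):
    """Linearize one clause given as a list of (kind, text) pairs.

    kind is 'plain', 'content' or 'relative'; subclause text is given
    without its complementizer.
    """
    main, final, extraposed = [], [], []
    last = len(constituents) - 1
    for i, (kind, text) in enumerate(constituents):
        if kind == "plain":
            main.append(text)
            continue
        marker, in_situ, moved = FORMS[kind]
        if i == last:
            # A subclause in final position stays where it is.
            final.append(f"{in_situ} {text}")
        else:
            # Otherwise strand có / sû and move the clause to the end.
            main.append(marker)
            extraposed.append(f"{moved} {text}")
    # Extraposed clauses come out opposite their order of appearance.
    clauses = final + extraposed[::-1]
    return " ".join(main) + "".join(f", {c}" for c in clauses) + "."


print(linearize([
    ("plain", "Nheje"),
    ("content", "pu buıfa nháo gúa"),
    ("plain", "râo fíachaq"),
]))
# Nheje có râo fíachaq, ꝡa pu buıfa nháo gúa.
```

Feeding it the two-subclause sentence reproduces the reverse ordering: the relative clause comes out first and the content clause last.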
The whole concept of mandatory extraposition requires a shift in perspective: one in which Toaq's prosody abides by formal rules and plays a role in deciding whether a sentence is grammatical. Here I've hand-waved those rules, but it could be a worthwhile task for a future toaqdıuche to try to characterize them in more general terms, like linguists have done for German.
Demystifying complementizers
I like that by deferring to the concept of tonal morphemes, we can clear up the relationship between main clause ꝡa and subclause ꝡá/ꝡâ: it really is the same complementizer with the same meaning, only it's dressed up in a tonal outfit.
But with other complementizers, questions remain. How can ma play the syntactic role of a complementizer while also serving as a polarity head? The answer, I think, is that it is only a complementizer on the surface. On closer examination, the complementizer is really a covert ꝡa onto which ma has moved.
That should come as a surprise: it is well established that Toaq is a wh-in-situ language, so this can't be an instance of wh-movement. The explanation that remains is that this is simply normal head movement occurring to satisfy a phonological constraint. Everything from Asp on up moves to SpecCP in order to make it pronounceable, for the sake of tonal morphemes that want to attach to it.
This movement happens before V-to-T movement, which is why ꝡa must become overt when all of Σ, T, and Asp are missing; V itself never ends up in SpecCP.
*Feq jí, jáı súq.
Feq jí, ꝡá jaı súq.
I sense that you are happy.
So, this theory suggests that ma should simply be understood as a polarity head, and that the tonal morphemes có and sû can attach directly to the other Σ, T, or Asp words that might start a content clause.
Ma dua súq, má Juqguo ní hao?
Do you know whether this thingy is Chinese?
Dua súq, jía duaı tao súq hí raı nha.
You know what you must do.
I originally thought that this same analysis could apply to tıo, that degree heads could move up to SpecCP as well. But, it turns out this would generate an ambiguity between adverbial adjuncts and restrictive content clauses:
Hao ní raı, ꝡê hao râq hóa [ké juna, [CP jâq raq hú gı ní raı]].
Hao ní raı, ꝡê hao râq hóa [ké juna] [AdjunctP jâq raq hú]. Gı ní raı.
So I suggest we stop considering tıo a complementizer and instead require it and other degree heads to appear in-situ within clauses. This issue does not apply to Σ, T, or Asp because these are not generated on adjuncts.
*Buaq gaı jí, tío chuqkuaı jí.
Buaq gaı jí, ꝡá tıo chuqkuaı jí.
I failed to notice how hungry I was.
*Zaı jí, túao ceaı súq.
Zaı jí, ꝡá tuao ceaı súq.
I hope you suffer little.
There is still one more complementizer that I'd like to reanalyze: la. So far, we've understood it as the complementizer that associates with the gap pronoun já to create properties.
Naı chum leo jí, lá nuo já.
I'm trying to sleep.
He nıeq jí, lá deoq já báq nô Tókıpona.
I'm bad at communicating in Toki Pona.
A curious fact about this sort of property is that it does not accept a tense. It must remain tenseless.
*Naı chum leo jí, lá jıa nuo já.
*He nıeq jí, lá naı deoq já báq nô Tókıpona.
We could explain this fact if we understood la to be a tense gap. That is, a tense head that allows verbs selecting for properties to evaluate them with respect to any time interval of their choice.
There is a related class of verbs that seem to want to quantify over time intervals. With a tense gap at our disposal, I can see these verbs as a use case for content clauses that contain la but not já.
Chaqtu, lá chuq jí báq beıgo.
I eat bagels every day. (It's a daily occurrence for me to eat bagels.)
Thinking with adverbs
There's a class of words that arise naturally in human language, and which we've been talking about for some time, but which remain absent from most of Toaq's documentation: adverbs of quantification. In English, these include words like generally, sometimes, often which appear to quantify over "instances" of something being true; whether by quantifying over points in time or some other variable.
I would like to give concrete syntax to these words. Like modals, adverbs of quantification accept an antecedent and a consequent, and like modals, it's often useful to elide the antecedent. So the choice is obvious: modals and adverbs of quantification should share the very same syntax.
When used without an antecedent, a modal or adverb of quantification simply appears in the falling tone. It takes scope over everything that follows.
Daı sho dua súq sá nıq raq suq.
You could discover something new about yourself.
Leı kuao báq taqchao.
Cars are rarely turquoise.
Notice that the last example means something like "few cars are turquoise" rather than "at few points in time are cars turquoise"; leı does not inherently deal with time. Whenever a generic reference (báq) appears under an adverb of quantification, the adverb associates with it, meaning, it becomes part of what the word quantifies over.
This is how we get the so-called "individual-level" readings of generics; these are always created by way of he associating with them, not by some inherent property of the verb.
Ceaı báq poq.
People (some people) suffered.
He ceaı báq poq.
People experience suffering.
To get a reading in which an adverb of quantification actually quantifies over points in time, you must supply a generic tense. By a beautiful coincidence, Toaq already has a set of tense words which fit this role perfectly: jela, mala, and sula.
Sula ruıq jí, má sanhe gı hú sıo.
At times I wonder whether the idea is any good.
Faı sula ruıq jí, má sanhe gı hú sıo.
I often wonder whether the idea is any good.
To supply a modal or adverb of quantification with an antecedent we use the structure ꝡâ … nâ. Any generic references bound in the antecedent will also be in scope in the consequent.
She ꝡâ bo báq haqpaoche báq aqshe nâ, kıaı háqpaoche áqshe.
If a farmer owns a donkey, they care for it.
He ꝡâ nunhe choaınuı báq fua nâ, nheje tuaobu poaqfuı máq.
Generally when furniture is that cheap, it turns out to be quite brittle.
In the case of generic tenses, they bind the tense pronoun la.
Tuguo sûla kaq ásu nánı gochıq nâ, la shoı côm hó ásu.
Whenever the dog sees that cat, she (in that moment) barks at it.
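One way to picture what "associating" with a generic reference means is to treat each adverb as a test on the proportion of relevant cases in which the consequent holds. The Python sketch below does that. The thresholds for he and leı, the vacuous-truth choice, and the tiny farmer and car data are all assumptions made up for illustration, not a claim about Toaq's actual semantics.

```python
from fractions import Fraction

# Hypothetical truth conditions over the proportion of relevant cases.
ADVERBS = {
    "she": lambda p: p == 1,              # conditional reading: every case
    "he": lambda p: p > Fraction(1, 2),   # "generally": assumed to mean most cases
    "leı": lambda p: p < Fraction(1, 4),  # "rarely": assumed threshold
}


def holds(adverb, cases, antecedent, consequent):
    """Check the adverb against the cases that satisfy the antecedent."""
    relevant = [c for c in cases if antecedent(c)]
    if not relevant:
        return True  # vacuously true in this sketch
    hits = sum(1 for c in relevant if consequent(c))
    return ADVERBS[adverb](Fraction(hits, len(relevant)))


# Donkey sentence: the cases are (farmer, donkey) pairs, so both generic
# references bound in the antecedent stay in scope in the consequent.
owns = {("Ana", "d1"), ("Bo", "d2"), ("Cy", "d3")}
cares_for = {("Ana", "d1"), ("Bo", "d2")}
pairs = [(f, d) for f in ("Ana", "Bo", "Cy") for d in ("d1", "d2", "d3")]

print(holds("she", pairs, lambda c: c in owns, lambda c: c in cares_for))  # False
print(holds("he", pairs, lambda c: c in owns, lambda c: c in cares_for))   # True

# Without an antecedent, the adverb ranges over the generic reference itself.
cars = ["turquoise"] + ["grey"] * 9
print(holds("leı", cars, lambda c: True, lambda c: c == "turquoise"))      # True
```

Note that nothing in `holds` mentions time: the cases are whatever the generic references range over, which is the point of the turquoise-car example.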
Kinds aren't real (they can't hurt you)
Speaking of generics, ever since their inclusion in the language they've attracted questions like, why speak in terms of "kinds" rather than concrete things? and what scope does báq take? Let's take a moment to clarify exactly what generics represent and how they scope. Or at least my perspective on the matter—I think it's slightly different than Hoemai's.
First of all, I think it's a misconception to say that in Toaq, we speak in terms of kinds. Generics are not part of the ontology of human language in the same way that people, ideas, and events are. Rather, generics are a syntactic phenomenon which gets almost entirely "optimized away" by the semantics module. Saying that báq gochıq refers to cat-kind is like saying that sía gochıq refers to cat-lessness; the former is useful for talking about the general tendencies of cats just as the latter is useful for constructing meanings which involve a lack of cats, but ultimately, both expressions refer to cats.
Let's look once more at a couple of the earlier examples.
Ceaı báq poq.
People (some people) suffered.
He ceaı báq poq.
People experience suffering.
It appears that the default interpretation of báq (when nothing interacts with it) is an existential quantifier, just like sá. However, its defining feature is that adverbs of quantification can repurpose it to behave like another, more exotic quantifier.
This raises the question: if all báq can do is behave like other quantifiers, then why have it at all? I could appeal to naturalism, the convenience of conditional sentence structures, and how you don't want to double-stack modals, but it's true that báq should have some unique use cases that it fills as well.
One such use case is the so-called "kind-level reading" created by certain verbs which select for generic references.
Tıfaı báq gochıq.
Cats are widespread.
Eqsıa báq teasaora.
Dinosaurs are extinct.
These can normally be paraphrased in terms of some other verb + an adverb of quantification, suggesting that adverbs of quantification are an important part of the vocabulary formation process.
Faı tı bâq rıaq báq gochıq.
Cats are often present at places.
Guosıa jıqshue báq teasaora.
In no case is a dinosaur extant.
However, I think there is still one more use case which báq is uniquely positioned to cover. It comes from the motivating example of cross-sentence anaphora:
Pu tısha báq poq nıq pátı. Ma gaı súq hó móq? Geı hó báq chea de. Chéa bî namako kú báq rua.
Someone new showed up at the party. Did you see her? She was wearing a beautiful hat. The hat was adorned with flowers.
I would like this to have a reading in which there is actually a formal binding relationship between báq poq and hó / báq chea and chéa, so that mechanical parsers have a real shot at understanding entire discourses. Unlike sá, whose scope is contained to scope islands (the clause it's in), báq already appears to sometimes "scope out" of clauses in order to meet up with an adverb of quantification. So, we could imagine that, as long as nothing like he or bu intervenes, báq could take widest possible scope and even scope above the current sentence.
Shoı ké asu. Aona sá poq. Jara hó.
The dog barks. Some people approach. It (the dog) is running.
Shoı ké asu. Aona báq poq. Jara hó.
The dog barks. People approach. They (the people) are running.
In other words, báq would use something like the following algorithm when deciding where to take scope:
1. If the verb selects for generics, it associates with the verb.
2. If there is an adverb of quantification above it in the same clause, it associates with the adverb of quantification.
3. If there is another scope-sensitive word above it, like bu or tú, it scopes under it as an existential quantifier.
4. If none of these are the case, then it scopes out of the clause and repeats steps 1-3 in the outer clause.
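The steps above can be sketched in Python as follows. The `Clause` record and the name `resolve_baq` are hypothetical, and the final fall-through (widest scope once there are no more outer clauses) is my reading of the earlier suggestion that báq could scope above the current sentence.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Clause:
    """What matters about one clause, as seen from báq's position."""
    verb_selects_generics: bool = False     # e.g. tıfaı, eqsıa
    adverb_above: Optional[str] = None      # e.g. "he", "leı"
    scope_word_above: Optional[str] = None  # e.g. "bu", "tú"


def resolve_baq(clauses: List[Clause]) -> Tuple[str, Optional[int]]:
    """Decide where báq takes scope.

    clauses[0] is the clause containing báq; later entries are the
    enclosing clauses, innermost first. Returns (outcome, clause index).
    """
    for depth, clause in enumerate(clauses):
        if clause.verb_selects_generics:         # step 1
            return ("verb", depth)
        if clause.adverb_above is not None:      # step 2
            return ("adverb", depth)
        if clause.scope_word_above is not None:  # step 3
            return ("existential", depth)
        # step 4: scope out and try the next clause up
    # Assumption: with nothing left to interact with, báq takes the
    # widest possible scope, even above the current sentence.
    return ("widest", None)


print(resolve_baq([Clause(adverb_above="he")]))               # ('adverb', 0)
print(resolve_baq([Clause(), Clause(scope_word_above="bu")]))  # ('existential', 1)
print(resolve_baq([Clause()]))                                 # ('widest', None)
```

Because step 2 is checked before step 3, a clause containing both he and bu above báq resolves to the adverb in this sketch.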
This gives báq a purpose of its own, even when it could be paraphrased by sá: it takes potentially wider scope and so may be a more convenient option for saying certain things.
She ꝡâ huo jí ké due nâ, ruaq tısha súq báq poq pátı rû teqga súq, má gaı jí póq.
If I heard that right, you said someone showed up at the party and asked whether I saw her.
She ꝡâ huo jí ké due nâ, sá poq nâ ruaq tısha súq póq pátı rû teqga súq, má gaı jí póq.
If I heard that right, there was someone such that you said she showed up at the party and asked whether I saw her.
He bu sha faq báq ẹtaı cîa báq elụıbuaq.
Success generally doesn't come without some instances of having failed.
Manipulating scope
In Toaq, scope is fully determined by the words you choose and the order in which you say them.
Shoe jí, ꝡá kuq súq sía raı.
I allow you to say nothing.
This means that when you want inverse scope, or something to scope out of a scope island, you need to somehow invert the order of words in syntax. For example, we can use the cleft verb nâ to front things.
Sía raı nâ shoe jí, ꝡá kuq súq hóa.
There is nothing that I allow you to say.
But this sentence now requires some forethought to produce; even though the quantified expression sía raı is only used in the object of shoe, you have to introduce it all the way at the start of the sentence.
I think we could improve on this by changing how the word nâ works. Rather than understanding it as a light verb which takes an argument on the left and a clause referencing hóa on the right, we could reframe it as simply a "pendent phrase" which allows a noun form to be placed freely within a clause, to be referred to later.
Shoe jí, sía raı nâ, ꝡá kuq súq ráı.
I allow—for no value of X—you to say X.
Specifically, I think it makes sense to allow nâ phrases to go anywhere a focus particle could go—at the start of the clause, before a verbal argument, or before an adjunct. And for that matter there are some other constructs that could benefit from having similarly free placement: modals, adverbs of quantification, and polarity heads.
Pua tú poq bu, lá poanıe já ké jıo.
Everybody did not enjoy being trapped in the building.
Cho súq hí mea nı?
Which of these do you like?
Cho jí… he tó báq bu bụqchoaısao. Tıu cho jí sía mea nı.
I like… only ones that aren't super expensive, generally. So I don't like any of them.
Jea súq sá loqpıma rú, she ꝡâ rıa ké rıaq po hu nueqche nâ, máo sá guobenueq ba.
Buy some chilies and, if that butcher's shop is open, some beef too.
This is effectively welcoming some "high adverbs" back into the post-field now that I'm aware of a semantic framework (Effects) that's prepared to explain their syntax and how they come to have linear scope. They're nice to have as a cheap alternative to fronting and a way to reframe a sentence in afterthought / on only one branch of a conjunction.
More generally… I'm of the opinion that we should be able to front nearly any part of speech if it can possibly have side effects. This is especially relevant if one adopts No Scope Creep. So what if we enabled nâ to work with a whole array of new parts of speech? (Warning: examples get progressively spicier.)
Tıo nâ mıu hụna le súq, ꝡá jıa faq nénı?
How likely do you think that is to happen?
Tíopuı nâ ruaq súq, ꝡá bo hó hụ́na gochıq?
How many cats did you say he has?
Ma nâ dıe lâojaq súq, ꝡá hụna buıfa nháo gúa?
In the end did you suggest that he leave the country, or not leave the country?
Bẹıbu nâ juoq hụna jea úmo hú meaq.
No, we shouldn't buy the boat.
Rî nâ chı súq kú, ꝡá zaıpoq jí hụ̂na ponapoq nháo?
Was it that you believed me to be a Tokiponaist, or her an Esperantist?
I don't know about you, but these constructions just feel so fresh and befitting of Toaq.
The middle path
Conjunctions are another construct whose grammar comes into question if we give up greediness. How do we know what precedence conjunctions, focus words, etc. take relative to each other? How do we manipulate that precedence when we inevitably want these constructs to group differently? The concept in Delta of inflecting conjunctions with tones to vary their precedence is neat for how resourceful it is, but it seems fundamentally designed for the model of a greedy parser that wants to know how far to backtrack. It's hard to imagine using rû fluently in speech.
The basic problem with afterthought coordination is, while it's usually easy to identify the conjunct on the right (thanks to head-initial grammar), arbitrarily many candidates for a conjunct can "stack up" on the same morpheme boundary on the left.
[sá po [sa po [sa po [nı]]]] rú (?) nánı
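To make the stacking concrete, here is a minimal Python sketch that lists every constituent ending at the same right edge. The nested-list encoding of the bracketing and the function names are assumptions made up for this illustration; they are not part of any Toaq tooling.

```python
from typing import List, Union

# Hypothetical encoding: a word is a str, a bracketed constituent is a list.
Tree = Union[str, List["Tree"]]


def flatten(tree: Tree) -> List[str]:
    """Return the words of a constituent in linear order."""
    if isinstance(tree, str):
        return [tree]
    words: List[str] = []
    for child in tree:
        words.extend(flatten(child))
    return words


def right_edge_constituents(tree: Tree) -> List[str]:
    """List every bracketed constituent that ends on the last word of `tree`."""
    if isinstance(tree, str) or not tree:
        return []
    found = [" ".join(flatten(tree))]
    # Only the final child can share the right edge with its parent.
    found.extend(right_edge_constituents(tree[-1]))
    return found


# The bracketing from the example above.
phrase: Tree = ["sá", "po", ["sa", "po", ["sa", "po", ["nı"]]]]
candidates = right_edge_constituents(phrase)

for candidate in candidates:
    print(candidate)
```

Each printed line is a possible left conjunct for a conjunction placed after nı, and every extra level of nesting adds one more.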
One way to disambiguate this is to give the conjunction something it expects to see on the left, a morpheme anchoring it unambiguously to one of the possible conjuncts. This is how you get forethought coordination like "both X and Y". But, forethought constructions can feel inconvenient if you're forced to use them in the common case. Maybe we can strike some kind of compromise between forethought and afterthought, and apply it uniformly?
Introducing middlethought coordination. Okay, I know that sounds silly, but bear with me. The key insight is that with conjunctions we have two degrees of freedom: the placement of the conjunction, and the tone it carries. By their powers combined, we can unambiguously specify what's being coordinated with what.
Taı hao [jí rú] [súq] ba.
May you and I succeed.
In the rising tone a conjunction connects two noun forms. So far, this looks familiar. Now what if the left conjunct contains an incorporated object?
Zao jí sá poq che guaı [baq jıoqdıu rú] [máo báq zudıu].
I know people who work on mathematics as well as linguistics.
In this position, rú connects the incorporated object báq jıoqdıu to máo báq zudıu with high precedence. Now here's the trick: if we want rú to connect the entire sá poq phrase instead, we simply change its position.
Zao jí, [sá poq rú che guaı baq jıoqdıu], [máo sá zudıuche].
In addition to people who work on mathematics, I also know some linguists.
Whoa. What is rú doing there in the middle of the DP? The answer is that conjunctions are fundamentally inpositions. They go after the noun, but before any adjectives or objects. Thus when there are no adjectives, they look just like a postposition and can easily be added in afterthought.
Jea jí [báq sofa nạ́bıe] [báq toqfua].
I bought a sofa and then a table. (I bought, after a sofa, a table.)
We already know that the "noun" in a DP probably has to move to D, since D could be the tonal morpheme ló. So this word order could be produced by the following pattern of movement and lowering, perhaps:
Coordination of clauses is pretty similar. The conjunction goes in the hiatus tone and we put it at the end of the clause, just before any extraposed constituents.
Chuoq ké toqfua nâ tua jí, ꝡá [shua ké jegaq rû] [tea góchıq].
Bumping into the table, I cause the vase to fall and Kitty to be scared.
Dua jí, ꝡá, [chı súq có kêo, luı sahuruaq jí], [bu luı sahuruaq jí].
I know that, although you believe that I have lied, I have not lied.
There is also a prefix form for conjunctions. You can use it to connect two verb words.
[Rụleo] [taı] som jí jégaq.
I tried and managed to repair the vase.
Pu mıu jí, ꝡá chum gaı jí sá [rạjom]… [hao].
I thought I was seeing a monster or… something.
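As a rough summary of the forms seen so far, the following Python sketch maps the shape of a conjunction to what it joins. It is only an illustration under my own assumptions: the function name and labels are invented here, the check looks purely at diacritics, and it ignores the more advanced phrase-level uses of the prefix form.

```python
import unicodedata

ACUTE = "\u0301"       # rising tone, as in rú
CIRCUMFLEX = "\u0302"  # the tone called "hiatus" in this post, as in rû
DOT_BELOW = "\u0323"   # underdot marking a prefix, as in rụleo


def conjunction_role(form: str) -> str:
    """Guess what a conjunction form joins, based only on its diacritics."""
    # NFD splits precomposed letters into a base letter plus combining marks.
    marks = set(unicodedata.normalize("NFD", form))
    if DOT_BELOW in marks:
        return "prefix: joins verb words"
    if ACUTE in marks:
        return "rising: joins noun forms"
    if CIRCUMFLEX in marks:
        return "hiatus: joins clauses"
    # No tone mark: the falling tone, used for the discursive reading.
    return "falling: discursive particle"


for form in ["rú", "rû", "rụleo", "ru"]:
    print(form, "->", conjunction_role(form))
```

The prefix check comes first because a prefixed word such as rụ́zeı can carry a tone mark of its own in addition to the underdot.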
If a verb is tightly bound to its object, as in the case of object incorporation, nouns, adjectives, or prepositions, then a prefix conjunction will expect each conjunct to come with a separate object.
Jara nháo kú [rị̂guq cóa], [tıa jío]?
Did he run under the bridge or behind the building?
In more advanced use cases, a prefix conjunction can attach to the head of a phrase to coordinate that entire phrase. This is useful for getting focus particles or another conjunction to group tightly with respect to the conjunction.
Za tı nânı [rụ́zeı rúaıfu], [rúaıjoaq] da.
In addition to the princess (for one), the king will be there.
Chum kueq [náq rạ́roı líq] [déo róı déopao].
Either the man and the woman or the child and the parent are gathering.
In cases like this it may make sense to pull out an additional modal or focus particle to signal to the listener that they should expect some tricky grouping.
Za tı nânı [rụ́zeı rúaıfu], [máo rúaıjoaq] da.
In addition to the princess (for one), the king will also be there.
Chum kueq [rạdaı náq róı líq] [daı déo róı déopao].
Either (possibly) the man and the woman or (possibly) the child and the parent are gathering.
Here máo and daı serve as spoken brackets, reinforcing the structure of the sentence without significantly altering its meaning.
Overall, as long as you can teach yourself to think of conjunctions as adpositions (which are similar, if not identical, phenomena in many natural languages!), I think this direction is really promising. It gives us a grammatical explanation for why high-precedence, right-branching coordination appears to be the default while leaving the speaker with the flexibility to join things of an arbitrarily complex structure.
Discursives
Whenever we use language to have non-trivial conversations, we inevitably need to give our discourse some additional structure. This can take the form of expressions like for example, therefore, in conclusion which are fixed by rhetorical convention. Toaq has been naturally developing particles with similar meanings, and I would like to recognize them as an independent part of speech.
Discursive particles are like focus particles in that they leave the at-issue content of a sentence as it is while adding some new, backgrounded claim. Unlike focus particles, their semantics take effect on the speech act level rather than the clause level.
Tóaqzu bî, zaosu muana, lá cıa já báq fammaho.
Toaq is known for its lack of terminators, for example.
Some focus particles (máo, jóao) have corresponding discursive forms. Certain conjunctions (ru, keo, huq) can even double as a discursive particle when used in the falling tone.
Duashao jí púı raı. Joqna, tı kû hí po lo Jemu úmo?
I wonder about many things. Mainly, where the hell are we?
Ma hıq choq nháo naq kú Ítalıazu? Maona, tı kû hí po lo Jemu úmo?
Was that guy speaking Italian just now? Moreover, where on earth are we?
Tú nıe lo koakea nâ sheı chuq súq tá, keo sem shue há nhûq jí sá mea ke cıereıtaq ba.
You're free to eat anything in the fridge, but please leave some of the fish rolls for me.
As you've probably noticed, the suffix -na commonly shows up as a way of deriving discursive particles from other words. This suffix comes from the words kîo … na, which can be used to construct arbitrary discursive phrases.
Kîo juaı jí na, ma chum foa gî súq?
Seriously, are you feeling alright?
Pu chum jıe noa chíetua rú, kîo raq ní hú na, tú poq, lá coe já jíaqkoaı.
The teacher and, on that note, everybody, were having trouble connecting to the internet.
Variable names
I've become aware through my attempts to implement the determiner ló in Kuna that it faces a bit of an identity crisis. Is góchıq a definite description which requires its referent to be a cat, or is it a variable name without any further semantic content? Maybe this problem is one of my own creation, as I'd like to cleanly disentangle deictic references and anaphoric references in Kuna's semantics module. But, I can't help thinking how elegant it would be if ló picked just one of these functions.
If we say that ló simply combines with a single verb and interprets it as a variable name, then something really interesting happens: you can use ló even for proper names.
Mıu sono shuaıgı Kía ní tue da.
Kia thinks it's kind of elegantly simple this way.
Here Kía is a variable which, in the absence of any antecedent, is bound by the context to me—not because it's meant to be a description of myself in any way, but simply because I like to be known by the word Kıa.
Things get even cooler when you combine this with the pronoun structure proposal. The entire variable name becomes a sort of determiner onto which you can attach a description.
Ma mala tı súq Rókı meı?
Have you ever been to the Rocky Mountains?
He maı jí Tóaq zu.
I love the Toaq language.
Paı jî Jóq geq suq.
John, who met you, is my friend.
Kú rao nánı nâ gaı jí ké oguı. Jeojuı haı sheaq nêo óguı cıa tea jí.
It was then that I noticed the snake. I was almost standing on top of the fearless thing!
I think there's something really excellent about the ambiguity between descriptions and names that this creates. It encourages variables which once referred to an actual description of somebody/something to gel as speakers use them more and finally be understood as a proper name, all without changing form. Speakers may also find this ambiguity fun for gender reasons. And this narrows down what it means for a name to be a native Toaq name: it's a single word, and it's a verb (possibly an imagined verb).
Because variables are bound to literal words, a variable bound to a deictic or anaphoric verb may be used by the listener without mirroring of pronouns or any other adjustment.
Je, du bo jí ké echı po beı súq. Ma luı gaı súq ké jıbo?
So, it seems I actually have your keys. Have you seen mine?
Nho, nı̣tı jíbo suqbo.
Yeah, your "mine" are here.
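The contrast between pronouns and word-bound variables can be sketched in a few lines of Python. This is a toy model under my own assumptions, not how Kuna represents anything: the `Discourse` class, the `resolve_pronoun` helper, and the speaker labels "A" and "B" are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict

JI = "jí"      # first-person pronoun
SUQ = "súq"    # second-person pronoun
JIBO = "jıbo"  # the verb behind the variable jíbo in the dialogue above


@dataclass
class Discourse:
    """Toy shared context: variables are keyed by the literal word."""
    variables: Dict[str, str] = field(default_factory=dict)

    def bind(self, word: str, referent: str) -> None:
        self.variables[word] = referent

    def resolve_variable(self, word: str) -> str:
        # The lookup does not depend on who is speaking.
        return self.variables[word]


def resolve_pronoun(pronoun: str, speaker: str, listener: str) -> str:
    """Pronouns are deictic: their referent flips with the speaker."""
    return {JI: speaker, SUQ: listener}[pronoun]


discourse = Discourse()
# Speaker A introduces "ké jıbo", referring to A's own keys.
discourse.bind(JIBO, "A's keys")

# When B answers, the pronoun now points at B, but the variable stays put.
print(resolve_pronoun(JI, speaker="B", listener="A"))
print(discourse.resolve_variable(JIBO))
```

Because the variable is stored under the word itself, the second speaker can reuse it verbatim, which is the behavior the dialogue illustrates.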
This change does suggest that the determiner ké would generally see more use. Perhaps, to make ké less tricky, we could repurpose it from an exophoric determiner into simply, "the definite determiner". Pragmatics might still give it exophoric qualities due to the contrast with hú, but since these are not part of the direct semantic content, it now becomes appropriate to use even in situations where you're unsure whether the referent was already mentioned.
In summary
Here I've presented a vision of a Toaq in which:
Sentences and fragments segment themselves
Any small word can withstand lack of stress
Nullary verbs and trailing adjuncts still exist
Conjunctions work by general principles, with sensible, emergent defaults
Discursives and adverbs of quantification become recognized parts of speech
Generics and variables have clearly defined roles
Donkey sentences can be expressed without paraphrase
Mechanical parsers can understand entire discourses
Rigid, linear scope is balanced by appropriately flexible word order
And critically, every part of the grammar is most naturally defined in linguistic terms without any mention of parsing strategies.
This has all been constructed with care to avoid known ambiguities and allow each change to be independently justified wherever possible. It's quite an intricate puzzle, but one—I think—with very worthwhile results.
In a way this has all been about optimism, for me: that it's possible for a loglang to be used, understood, and taught just like a natural language in nearly every respect; that we can and should take segmentation issues seriously; that we don't need to compromise on Toaq's aspirations. I hope I've proven that it's all within reach.
Thank you to Laqme for providing encouragement and early feedback informing the scope of these ideas, and to Hoemai and especially you for hearing me out. I would love to know what you think about this Eatoaq.