Tagged: artificial language

  • nmw 19:04:09 on 2016/05/27
    Tags: artificial language, artificial languages, emerge, human intelligence, training set, training sets

    Literacy and Machine Readability: Some First Attempts at a Derivation of the Primary Implications for Rational Media 

    Online, websites are accessed exclusively via machine-readable text. Specifically, the character set prescribed by ICANN, IANA and similar regulatory organizations consists of the 26 characters of the Latin alphabet, the „hyphen“ character and the 10 Arabic numerals (i.e. the ciphers 0-9). Several years ago there was a move to accommodate other language character sets (this movement is generally referred to as „Internationalized Domain Names“ [IDN]), but in reality this accommodation is nothing more than an algorithm which translates writing that uses such „international“ symbols into strings from the regular Latin character set, using reserved spaces within the enormous set of strings managed by ICANN for such „international“ strings. There is no way to register a string directly using such „international“ characters. Another rarely mentioned tidbit: this obviously means that the set of IDN strings that can be registered is vastly smaller than the set of strings using the standardized character set approved for direct registration.
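
    As a rough illustration of that translation step, here is a minimal Python sketch using the standard library’s built-in IDNA codec (the label shown is just a made-up example, not a registered name):

        # Minimal sketch of the IDN translation described above: an „international“
        # name is never registered directly; it is first converted into a plain
        # Latin-letter/digit/hyphen string (prefixed with „xn--“), and that ASCII
        # string is what actually lives in the registry.

        label = "münchen"  # a label containing a non-Latin character

        ascii_form = label.encode("idna").decode("ascii")
        print(ascii_form)    # -> xn--mnchen-3ya  (the form stored in the DNS)

        display_form = ascii_form.encode("ascii").decode("idna")
        print(display_form)  # -> münchen  (the „international“ display form)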

    All of that is probably much more than you wanted to know. The „long story short“ is that all domain names are machine readable (note, however, that – as far as I know – no search engine available today on the world-wide web uses algorithms to translate IDN domain-name strings into their intended „international“ character strings). All of the web works exclusively via this approved character set. Even the so-called „dotted decimals“ – the numbers which refer to individual computers [the „servers“] – are written exclusively with Arabic numerals, though in reality they are based on groups of bits: each number represents a „byte“-sized group of 8 bits… in other words, it could be translated into a character set of 256 characters. In the past several years there has also been a movement to extend the addresses available to accommodate more computers, from 4 bytes (commonly referred to as IPv4 or „IP version 4“) to 16 bytes (commonly referred to as IPv6 or „IP version 6“), thereby accommodating 2^96 – roughly 8 x 10^28 – times as many computers as before. Note, however, that each computer can accommodate many websites / domains, and the number of domain names available exceeds the number of computers available by many orders of magnitude (coincidentally, the number of domain names available in each top-level domain [TLD] is approximately 1 x 10^100 – in the decimal system, that’s a one with one hundred zeros, also known as 1 googol).
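
    For what it’s worth, the arithmetic above can be checked in a few lines of Python (rounded, back-of-the-envelope figures only):

        # Address and name spaces mentioned above, for illustration only.

        ipv4_addresses = 2 ** (4 * 8)    # 4 bytes  -> 4,294,967,296 addresses
        ipv6_addresses = 2 ** (16 * 8)   # 16 bytes -> about 3.4 x 10^38 addresses
        print(ipv6_addresses // ipv4_addresses)   # 2**96, roughly 7.9 x 10^28

        # Each number in a „dotted decimal“ is one byte-sized group of 8 bits,
        # i.e. one of 256 possible values.
        print(len("192.0.2.1".split(".")))        # -> 4 such groups

        # Domain names: up to 63 characters per label, drawn from 26 letters,
        # 10 digits and the hyphen (37 symbols).
        print(37 ** 63)                           # roughly 6 x 10^98, close to a googol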

    Again: Very much more than you wanted to know. 😉

    The English language has a much smaller number of words – a very large and extensive dictionary might have something like 100,000 entries. With variants such as plural forms or conjugated verb forms, that will still probably amount to far fewer than a million possible strings – in other words, about 94 orders of magnitude fewer than the number of strings available as domain names. What is more, most people you might meet on the street probably use only a couple thousand words in their daily use of „common“ language. Beyond that, they will use even fewer when they search the web for information (for example: instead of searching for „sofa“ directly, they may very well first search for something more general, like „furniture“).
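
    The „94 orders of magnitude“ figure is simply the difference of the two exponents, as a quick sketch shows:

        import math

        dictionary_entries = 1_000_000     # generous upper bound from the text
        domain_name_strings = 10 ** 100    # roughly one googol per TLD

        gap = math.log10(domain_name_strings) - math.log10(dictionary_entries)
        print(gap)   # -> 94.0 orders of magnitude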

    What does „machine readable“ mean? It means a machine can take in data and process it algorithmically to produce a result – you might call the result „information“. For example: there is a hope that machines will someday be able to process strings – or even groups of strings, such as this sentence – and thereby derive („grok“ or „understand“) the meaning. This hope is a dream that has already existed for decades, but the successes so far have been extremely limited. As I wrote over a decade ago (in my first „Wisdom of the Language“ essay), it seems rather clear that languages change faster than machines will ever be able to understand them. Indeed, this is almost tautologically true, because machines (and so-called „artificial intelligence“) require training sets in order to learn – and such training sets from so-called „natural language“ must be expressions from the past; not even just from the past, but also approved by speakers of the language, i.e. „literate“ people. So-called „pattern recognition“ – a crucial concept in the AI field – always recognizes patterns which have been previously defined by humans. You cannot train a machine to do anything without a human trainer, who designs a plan (i.e., an algorithmic set of instructions) which flows from human intelligence.
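
    To make that last point concrete, here is a deliberately simple sketch (plain Python, no AI library – the sentences and labels are invented for illustration): whatever „pattern“ the machine ends up recognizing is bounded by the human-supplied training set.

        training_set = [
            ("the cat sat on the mat",        "about animals"),
            ("dogs chase cats",               "about animals"),
            ("the server returned an error",  "about computers"),
            ("reboot the machine",            "about computers"),
        ]

        def classify(sentence):
            """Label a sentence by word overlap with the human-labelled examples."""
            words = set(sentence.lower().split())
            best_label, best_overlap = "unknown", 0
            for example, label in training_set:
                overlap = len(words & set(example.split()))
                if overlap > best_overlap:
                    best_label, best_overlap = label, overlap
            return best_label

        print(classify("my cat chases dogs"))          # -> about animals
        print(classify("the machine shows an error"))  # -> about computers
        print(classify("sofa"))                        # -> unknown: no human ever labelled anything like it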

    There was a very trendy movement, quite popular several years ago, which led to the view that data might self-organize – that trends might „emerge from the data“ without the nuisance of consulting costly humans – and this movement eventually led to what is now commonly hyped as „big data“. All of this hype about „emergence“ is hogwash. If you don’t know what I mean when I say „hogwash“, then please look it up in a dictionary. 😉

     
  • nmw 17:57:54 on 2016/05/23
    Tags: artificial language, character, characters, indexes, information retrieval, intelligibility, intelligible

    Fundamental Principles of Rational Media 

    In my previous post, I noted that my concept of rationality differs from the general, widely accepted views of this notion. I do not disagree with these views. Instead, I believe the way I view rationality is more generalized.

    To put it simply: rationality can be attributed to any idea – in other words, any idea can be considered rational – if it can be expressed in language. What language is or isn’t is perhaps a more difficult question to answer, but since mathematics is one such language – and since logic, i.e. „mathematical logic“, can be interpreted as a subset of mathematics – logic can also be interpreted as a language.

    Most so-called „programming“ languages are also, well: languages. „Natural“ languages are also languages (indeed: the distinction between „natural“ language and „artificial“ language is really not very distinct, clear, obvious or anything like that). And as I mentioned in my previous post, even facial expressions, scents, DNA and many other things can also be interpreted as language.

    In the context of „rational media“, however, I suggest limiting the meaning of the expression to what is often referred to as „machine readable“ language. I would even suggest limiting the extent of „rational media“ more than that, because there are actually many types of machine-readable expressions which are usually considered to be unintelligible by humans without machines – for example: Hollerith cards, magnetic tape and discs, compact discs, USB sticks, bar codes and QR codes, to name just a few. There are also some expressions which are simply difficult to express in the traditional notion of natural language – for example: numerical values written in hexadecimal formats.
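
    A tiny illustration of the hexadecimal point (in Python, though any language would do): the same numerical value has a machine-friendly hexadecimal spelling and a decimal spelling, neither of which reads like a „natural language“ word.

        value = 0xFF            # hexadecimal literal
        print(value)            # -> 255 (the decimal spelling)
        print(hex(255))         # -> '0xff'
        print(int("ff", 16))    # -> 255, parsing the hexadecimal string directly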

    All of this is by and large simple and straightforward in an online setting, because web addresses are almost all written using what most people consider to be natural language expressions (though note that so-called „internationalized domain names“ / IDNs are written in a code which allows for algorithmic translation between the Latin character set used in all domain names and transformed expressions in specialized character sets [and vice versa]). In general, surfing the web is very much like using an encyclopedia, a lexicon or what used to be called a „card catalog“. The primary difference is that whereas the web is considered to be distributed, the traditional forms were usually viewed as created by a single author, organization or institution. Therefore, whereas for many decades and even centuries people had become very accustomed to indexes being something created by specialized „indexers“ or „indexing services“, today the „index“ to the web is considered to be integrated into the web itself (note, however, that the registries of „top level domains“ [TLDs] are actually sort of like the „indexes of last resort“ … that is, „last resort“ excluding ICANN).

    I will simply abruptly stop here for now – as I feel this is probably already quite a lot to digest. If you would like to add comments, ideas, questions or anything like that, please feel free to register @ nooblogs.com, which is intended to be more for discussion and/or sharing of ideas.

     
  • nmw 15:16:27 on 2016/03/04
    Tags: artificial language, content, WordPress, functional, intelligences, procedural, procedure, procedures, refer, reference, relate, technologies

    Limitations in the WordPress Notifications algorithm 

    Ted and Brandon’s most recent episode of the „Concerning AI“ podcast is a very rewarding listen… – mainly because of their thinking with respect to compassion towards (or against) algorithms.

    Having compassion towards or against an algorithm seems like a very strange concept, and I feel I very much agree with Ted and Brandon’s thinking during the episode, but I also want to use the suggestion as a „what if“ sort of springboard.

    Ted and Brandon provided several examples of algorithms (and/or tools). Perhaps the quintessential example is the hammer (for pounding nails). Another example they provided was the so-called „Google“ algorithm (presumably counting the links that point to any particular internet address in order to „load the value“ of that address). Another algorithm they mentioned was the AlphaGo algorithm. One they didn’t mention was the Facebook Group algorithm, which they employ for the purpose of facilitating discussions related to the podcast. Another algorithm (or perhaps „procedural code“ might be a more appropriate term) they didn’t mention is the WordPress Notifications procedure (or function?) … which attempts to notify the management of a site running WordPress when content on that site is mentioned. I am not exactly sure how it works – but I think both sites might have to be running WordPress (or at least software that is compatible with the notification procedure / function) … thereby enabling one site to send the other site a message indicating that the latter site was referenced by the first site. In traditional publishing, such references were called „footnotes“, and there was indeed also a tool in the paper era that notified authors when something they wrote had been cited (these were referred to as „citation indexes“).
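
    If the mechanism in question is the Pingback protocol (which WordPress implements over XML-RPC), then such a notification could in principle be sent by hand along the following lines – a rough sketch only, with placeholder URLs rather than real endpoints:

        import xmlrpc.client

        source = "https://example-blog.test/my-post-that-links-somewhere"
        target = "https://example-podcast.test/episodes/14"   # a specific post URL

        # WordPress sites usually expose their XML-RPC endpoint at /xmlrpc.php
        endpoint = xmlrpc.client.ServerProxy("https://example-podcast.test/xmlrpc.php")

        try:
            # pingback.ping(sourceURI, targetURI): the receiving site checks that
            # the source page really links to the target post before accepting.
            print(endpoint.pingback.ping(source, target))
        except xmlrpc.client.Fault as fault:
            # e.g. a fault when the target is not a pingback-enabled resource –
            # which is what happens when only the site „in general“ is mentioned.
            print(fault.faultCode, fault.faultString)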

    I am belaboring this one algorithm (or procedure or function or whatever sort of code it might be) primarily because I think it could be coded better. As far as I know, whenever I mention the site concerning.ai in general, the concerning.ai site is not notified. The only way the concerning.ai site can be notified by my mentioning it is if I mention a particular piece of content – for example: Episode Number 14. I think it would be nice if the site would be notified even if I only refer to the site in general.

    Ted and Brandon discuss that they don’t feel as if they can empathize with any of the algorithms they mention – but I feel they probably do. If they want to play Go, then they will probably be more likely to „hang out“ with a Go algorithm. If they want to meet people, they might be more likely to „hang out“ with a Facebook algorithm. If they want to watch Youtube videos, they might search for such information directly on Youtube, or perhaps they might utilize the Google search algorithm (in particular because Google and Youtube are apparently very closely related).

    I have a hunch that the best way to think about this is via the concept of relationships. When my aim is to pound nails, then I will probably develop a close relationship with a hammer. If my aim is to play Go, then I could develop a relationship with algorithms devoted to Go (perhaps alpha-go.com or maybe play-go.net etc.), or perhaps I could input strings into some other algorithm (e.g. Google, Facebook, Youtube, etc.) and use whatever output I get in order to reach my goal. This might also work for the goal „have a conversation“. Indeed: many written texts are in a way conversations, and we often develop relationships with codices that are no longer limited to the life spans of their authors, etc. I don’t even know who invented hammers. I mainly just think of them as „hammers“.

    Please note that I have tried to make this post very brief. Lawrence Lessig has argued that the code written in so-called “artificial languages” functions like law. I could equally well argue that laws codified in so-called “natural language” are actually code. For more on this, please consider also reading “How to Constrain the Freedom to Choose the Best of all Possible Worlds During an Era of Uninterrupted Progress“.

     
  • nmw 16:17:23 on 2016/02/20
    Tags: artificial language

    In Our Brains… 

    In our brains, almost everything is connected to the world outside of our brains. Thinking about artificial intelligence (AI), my friends Ted and Brandon are asking for help (@http://concerning.ai). In my humble opinion: If you want to „get somewhere“ then you need to think „outside of the box“.

    What I’m writing here has mainly to do with things Brandon and Ted talk about in episode 10. Also, in episodes 11 and 12, Brandon and Ted talk with Evan Prodromou, a „practitioner“ in the field. Evan raises (at least) two fascinating points: 1. procedural code and 2. training sets. Below, I will also talk about these two issues.

    When I said above that there is a need to „think outside of the box“, I was alluding to much larger systems than what is usually considered (note that Evan, Ted and Brandon also touched on a notion of „open systems“). For example: language. So-called „natural language“ is extremely complex. To present just a glimmer of the enormous complexity of natural language, consider the „threshold“ anecdote Ted shared at the beginning of episode 11. A threshold is both a very concrete thing and also an abstract concept. When people use the term „threshold“, other people can only understand the meaning of the term by at the same time also considering the context in which the term is being used. This is, for all practical purposes, an intractable problem for any computational device which might be constructed by humans sometime in the coming century. Language itself does not exist in one person or one book; it is something which is distributed among a large number of people belonging to the same linguistic community. The data is qualitative rather than quantitative. Only the most fantastically optimistic researchers would ever venture to try to „solve“ language computationally – and I myself was also once one such researcher. I doubt humans will ever be able to build such a machine… not only due to the vast resources it might require, but also because the nature of (human) natural language is orthogonal to the approach of „being solvable“ via procedural code.

    Another anecdote I have often used to draw attention to how ridiculous the aim to „solve language“ seems is Kurzweil’s emphasis on pattern recognition. Patterns can only be recognized if they have been previously defined. Keeping with another example from episode 11, it would require humans to walk from tree to tree and say „this is an ash tree“ and „that is not an ash tree“ over and over until the computational device were able to recognize some kind of pattern. However, the pattern recognized might be something like „any tree located at a listing of locations where ash trees grow“. Indeed: The hope that increasing computational resources might make pattern recognition easier underscores the notion that such „brute force“ procedures might be applied. Yet the machine would nonetheless not actually understand the term „ash tree“. A computer can recognize what an ash tree is IFF (if and only if) a human first defines the term. If a human must first define the term, then there is in fact no „artificial intelligence“ happening at all.
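
    A deliberately naive sketch of that ash-tree scenario (all coordinates and labels invented for illustration): after humans label trees one by one, the „pattern“ the machine ends up with may be nothing deeper than a lookup table of the locations it was shown.

        labelled_trees = {
            (48.137, 11.575): "ash",
            (48.140, 11.580): "not ash",
            (48.150, 11.560): "ash",
        }

        def recognise(location):
            """Recognise an ash tree purely from the human-labelled locations."""
            return labelled_trees.get(location, "no idea: never labelled by a human")

        print(recognise((48.137, 11.575)))  # -> ash (because it was told so)
        print(recognise((48.200, 11.600)))  # -> no idea: never labelled by a human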

    I have a hunch that human intelligence has evolved according to entirely different laws – „laws of nature“ rather than „laws of computer science“ (and/or „mathematical logic“). Part of my thinking here is quite similar to what Tim Ferriss has referred to as „not-to-do lists“ (see „The 9 Habits to Stop Now“). Similarly, it is well-known that Socrates referred to „divine signs“ which prevented him from taking one or another course of action. You might also consider (from the field of psychology) Kurt Lewin’s „Field Theory“ (in particular the “Force Field Analysis” of positive / negative forces) in this context, and/or (from the field of economics) the „random walk“ hypothesis. The basic idea is as follows: our brains have evolved with a view towards being able to manage (or „deal with“) situations we have never experienced before. Hence „training sets“ are out of the question. We are required to make at best „educated“ guesses about what we should do in any moment. Language is a tool-set which has symbiotically evolved in our environment (much like the air we breathe is also conducive to our own survival). Moreover: both we and our language (as also other aspects of our environment) continue to evolve. Taken to the ultimate extreme, this means that the coexistence of all things evolving in concert shapes the intelligence of each and every sub-system within the universe. To put it rather plainly: the evolution of birds and bees enables us to refer to them as birds and bees; the formation of rocks and stars enables us to refer to them as rocks and stars; and so on.

    In case you find all of this scientific theory a bit too theoretical, please feel free to check out one of my recently launched projects – in particular the „How to Fail“ page … over at bestopopular.com (which also utilizes the „negative thinking“ approach described above).

     