Thursday, December 19, 2019
On Religious Sacrifice
A friend-of-a-friend post showed up in my Facebook newsfeed once that rather bothered me. Some guy I don't know posted a link to an article about the LDS-church-owned City Creek Mall, along with a bunch of (unexplicated) Bible verses on hypocrisy. This of course set off a rather extensive and heated discussion in the comments (which resulted in it showing up in my news feed when a friend commented).
The primary point of discussion was whether or not it was appropriate for the church to be involved in commercial ventures. As though capitalism and creating wealth are somehow inherently anti-spiritual? That it is possible to run a business and to be a good person at the same time, and, furthermore, to use the proceeds as an enabler of greater amounts of good in the world than could be accomplished if you were poor, is so incredibly obvious to me, and I find myself so utterly incapable of fathoming the minds of people who could think otherwise, that I really don't know how to argue this point. If you are not LDS, think what you will; if you are, chill out, dudes, and just trust that people in charge know what they're doing even if you don't get it yet.
Of course, with the recent news items about how the church has billions of dollars stashed away in reserve, the same kinds of issues are coming up all over again. Somehow, it's a shocking, outrageous thing for a church to actually save money. Instead of, what, spending all of its assets on immediate distributions to the poor (something which the critics, of course, aren't actually doing themselves), and being left with nothing in reserve and no ability to continue to do further good?
Another common concern, however (and the source of the title of this note), seemed to be "was my tithing money used to pay for that!?" (Spoilers: At least in the case of City Creek Mall, no, it wasn't.) Followed up by agitated remarks about how "I expect my tithing to be used for..." whatever, building churches, helping the poor, etc. In short: If you are worried about what your tithing money is being used for, you are doing it wrong. Or any other donations, for that matter.
The Church should be accountable for how it uses its funds. And it is. That's why it has an auditing department. And it does in fact do a good job of being pretty transparent about the important, large-scale stuff--hence, I know that City Creek Mall was not built with tithing money. It's not completely transparent, 'cause it doesn't have to be, and mandating that would actually add to administrative overhead and lead to worse wastage of the sacred money that everybody is so worked up about. But we know what kinds of things tithing money is used for. We know what kinds of things fast offerings are used for. And we know what kinds of businesses the church owns. Nevertheless, the fact that you can go and look that stuff up is not for your benefit--it's for the Church's benefit, to ensure that the Church organization is kept in line. The emotional benefits to members of the general populace due to the fact that they must be allowed to know some of this stuff as a requirement for preventing mismanagement are purely ancillary. In other words, it's nice that you can find out a little bit about how tithing and other donations are and aren't spent, but you really shouldn't ever need to care. You shouldn't need to care any more than you care about how your next-door neighbors manage their budget, because it's not your money.
This is because the whole point of paying tithing is to give it up. If you think along the lines of "I expect my tithing to be used to (x), not (y)", or even just "they better not be using my tithing for (z)" you have not actually given it up. Indeed, I would go so far as to say that you have not actually paid tithing at all, despite what the donation slip says--rather, you've granted the church some of your money with additional contractual stipulations known only to you in your head, and that's a very different thing. Members aren't commanded to pay tithing to finance the Church (although that's a wonderfully practical side-benefit, much like the emotional security of being able to look at financial reports)- members are commanded to pay tithing because members need to be taught to give things up to God. And the Church could accomplish that purpose regardless of where the money goes afterwards. It just doesn't matter. If 50% of the tithing budget were allocated to gold-plating all church-owned buildings and vehicles and the other half to powering the furnaces for Church office buildings by burning $100 bills (which will never happen, because see above about mismanagement, etc., but let's go with the hypothetical scenario), that should not change a single person's willingness to pay tithing or faith in the Church. If it would make a difference to you, clearly you need to keep paying tithing--you have not yet understood the meaning of sacrifice.
Furthermore, tithing is not a charitable donation. I mean, legally it is, because the church is, among other things, a charity-- but the commandment to tithe is not a commandment to give to charity. We have a whole separate thing for that--fast offerings. And giving to charity does not discharge one's responsibility to tithe. Tithing serves a completely different purpose--to teach you how to sacrifice, with a simple, concrete example. If we could all just figure that out, then maybe we'd be ready for the real deal-- the real law of sacrifice, and total consecration of all we have to the Lord.
Everybody is Smarter Than You Think
1. I am imperfect at translating my thoughts into speech that can be unambiguously parsed by an arbitrary second discourse participant.
2. Due to (1), I have frequently witnessed people respond to things I say in a manner that makes it obvious that they did not reconstruct the thought that I started with properly (i.e., they didn't understand what I was trying to say).
3. This probably makes me seem less intelligent than I actually am, especially when the misunderstood versions of my thoughts are, in fact, wrong. Either factually wrong, or indicative of an immature point of view.
4. I put a lot more thought into what I say in order to ensure accurate communication than most other people. (Although this is probably tempered by the fact that I am not neurologically typical, so I kinda have to.)
5. Due to (4), most other people are probably at least as likely as I am to have the same experience with being misunderstood. The less intelligent they actually are, the more likely this is to be their own fault; the more intelligent they are, the more likely it is to be because the other discourse participants are simply incapable of formulating the correct thought (but, of course, high general intelligence does not imply high social intelligence, so it could still easily be your own fault, insofar as not knowing your audience is your own fault).
6. Ergo, most people you talk to will seem, at least until you get to know them very, very well, much less intelligent than they actually are, because what you think is going on in their brains is a very degraded version of what they're actually thinking.
In short, language is a noisy, imperfect channel for thoughts. So give people the benefit of the doubt.
Tuesday, November 12, 2019
Asymmetric Cryptography For Those Who Only Know Arithmetic
This is the second in a series of posts adapted from old Facebook Notes. The original was published on February 8th, 2011.
Let's start with basic substitution cryptography, and take baby steps all the way to asymmetric public key cryptography.
The simplest sort of cipher, and one that nearly everyone is probably familiar with, is the alphabetic rotation cipher (like rot-13). The key is how many letters you're going to rotate by; i.e., for rot-13, you rotate by 13 letters, so to encrypt an 'a', you count 13 spaces along the alphabet from 'a' to get 'n', to encrypt a 'w' you count 3 spaces down to the end of the alphabet and then the remaining 10 spaces starting over at 'a' to get 'j', and so forth. Mathematically, this operation is known as "modular addition", or "clock addition" (because numbers wrap around like the hours on a clock face). Except in the special case of rot-13 using the 26-letter roman alphabet, where adding 13 twice gets you back to where you started, the decryption function is different from the encryption function--you use modular subtraction instead of addition, counting backwards instead of forwards to reverse the encryption.
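To make the counting concrete, here is a minimal sketch of a rotation cipher in Python. The function names `rotate` and `unrotate` are just labels I've picked for illustration, and the sketch assumes lowercase roman letters only, passing everything else through untouched.

```python
import string

ALPHABET = string.ascii_lowercase  # the 26 letters, with 'a' at index 0


def rotate(text, key):
    """Shift each lowercase letter forward by `key` places, wrapping around."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            # Modular ("clock") addition: wrap past 'z' back to 'a'.
            out.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            out.append(ch)  # leave spaces and punctuation alone
    return "".join(out)


def unrotate(text, key):
    """Reverse `rotate` by counting backwards (modular subtraction)."""
    return rotate(text, -key)


print(rotate("a", 13))                   # n
print(rotate("w", 13))                   # j
print(unrotate(rotate("hello", 5), 5))   # hello
```

Applying `rotate(..., 13)` twice also gets you back where you started, which is the rot-13 special case mentioned above.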
Now, rotation ciphers are so simple that they're only really useful at all if your potential attackers don't know that you're using it, but for most ciphers of this general category you assume that the functions are well known and the basis for security is keeping the key secret. If anyone finds out your key, you're screwed. And keeping the key private is a big problem with this sort of encryption (which was pretty much the only kind of encryption around for most of history), because you have to share the key with the person you want to talk to, and the key has to be sent "in the clear" (because you haven't shared it yet, so you can't encrypt it).
Note the features of this kind of cryptography: the encryption and decryption procedures are public (everyone knows how to count letters in the alphabet), they are asymmetric (encryption involves adding, while decryption involves subtracting), and the key is private and symmetric (encryption and decryption use the same key).
Now, instead of keeping my key private, I could not worry about the key and just keep the decryption function private. In fact, I might just do away with a key entirely; an example of this is the use of invisible ink (although that's really a better example of steganography than encryption). The message can only be decrypted if you know the secret decryption function (chemical process) for the relevant type of ink. And if I'm content with one-way communication, I don't have to share the secret. I can make the key and the encryption function completely public, and just keep the really-hard-to-find decryption function to myself, and then anybody who wants to can send me a secure message that only I can decrypt. If I want to go two-way, then my partner will have to come up with his own set of public encryption and private decryption procedures. Using a rotation cipher isn't very practical here, because figuring out how to reverse modular addition with modular subtraction is really easy, so your secret function wouldn't stay secret for very long. You can imagine functions that are harder to figure out, but since finding new mathematical functions with the right properties is rather difficult, this system is not very practical. It does not make sense for everyone who wants to send private messages to have to do serious research in pure mathematics!
But what if we made the key asymmetric instead of (or in addition to) the encryption and decryption procedures? With a rotation cipher, one way to do that would be to say that if, e.g., my encryption key is 5, then my decryption key is 21, and we're always going to use modular addition, symmetrically, to encrypt and decrypt. You add 5 to each letter to get the ciphered text, and then you add 21 to each ciphered letter (completing a 26-space trip around the alphabet) to get back to where you started. Now, you can do one of two things:
- You can publish your encryption key, and keep the decryption key secret. That means other people can encrypt messages that only you can read.
- You can publish your decryption key, and keep the encryption key secret. That means you can write messages (digital signatures) that other people can prove came from you, because decrypting it with your published decryption key works, and no one else is capable of doing that kind of encryption.
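As a rough sketch of the asymmetric-key idea with the numbers used above (encryption key 5, decryption key 21), the same modular-addition function does both jobs. The helper name `add_key` is my own, and the sketch again assumes lowercase letters only.

```python
ENCRYPT_KEY = 5
DECRYPT_KEY = 26 - ENCRYPT_KEY  # 21: together the two keys make a full lap


def add_key(text, key):
    """Apply modular addition with `key` to every lowercase letter."""
    shifted = []
    for ch in text:
        if "a" <= ch <= "z":
            shifted.append(chr((ord(ch) - ord("a") + key) % 26 + ord("a")))
        else:
            shifted.append(ch)
    return "".join(shifted)


ciphertext = add_key("hello", ENCRYPT_KEY)
print(ciphertext)                        # mjqqt
print(add_key(ciphertext, DECRYPT_KEY))  # hello
```

Which of the two keys you publish decides whether you are receiving private messages or signing your own.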
Of course, a system based on simple modular arithmetic still isn't all that secure, because finding an additive inverse is *really easy*; in normal arithmetic, you just stick a negative sign in front (i.e., the additive inverse of 5 is -5), and in modular arithmetic you just subtract from the size of the alphabet (so the inverse of 5 in an alphabetic rotation cipher is 26-5 = 21). As soon as you publish the fact that your decryption key is 21, anyone can do some simple subtraction and figure out that your private encryption key is 5, or vice-versa. But, there are lots of mathematical operators and functions which have the nice property that some numbers have inverses in relation to those operators--not just addition. For example, the inverse of 2 over multiplication is 1/2, etc. Now, finding multiplicative inverses isn't all that difficult either, but it's a little harder than just sticking a minus sign out front, so let's run through an example using multiplication as the encryption function:
Say my public key is 2. Multiplying letters doesn't make immediate sense like counting them does, so we need to convert our letters into numbers first; fortunately, computers already represent letters as numbers anyway. So, suppose you want to secretly send me the letter 'c', which we'll assign the number 3. So, you use the encryption function (*) and the key (2) to calculate the ciphertext (3*2=6), which you send to me.
My private key is the inverse of 2-- 1/2.
I get the ciphertext 6 from you, and use the encryption function (*) with my private key (1/2) to recover the plaintext (6*1/2 = 3), which I can then interpret as the letter 'c'. Ta da! We have just accomplished encrypted communication without ever sharing a secret key.
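The same exchange can be written out in a few lines of Python. This is only a sketch of the toy scheme above: the names `letter_to_number`, `number_to_letter`, and `apply_key` are made up for the example, and `Fraction` is used so that the key 1/2 stays exact.

```python
from fractions import Fraction

PUBLIC_KEY = Fraction(2)
PRIVATE_KEY = 1 / PUBLIC_KEY  # the multiplicative inverse, 1/2


def letter_to_number(ch):
    """'a' -> 1, 'b' -> 2, 'c' -> 3, ..."""
    return ord(ch) - ord("a") + 1


def number_to_letter(n):
    """1 -> 'a', 2 -> 'b', 3 -> 'c', ..."""
    return chr(int(n) - 1 + ord("a"))


def apply_key(value, key):
    """The one shared function: plain multiplication."""
    return value * key


ciphertext = apply_key(letter_to_number("c"), PUBLIC_KEY)
plaintext = apply_key(ciphertext, PRIVATE_KEY)
print(ciphertext)                   # 6
print(number_to_letter(plaintext))  # c
```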
By now, the basic proof-of-concept, that there is a way to communicate privately without ever having to arrange to share a secret with someone ahead of time, should be clear--as long as all of your adversaries are toddlers who haven't learned arithmetic. The multiplication-based system is still easy to crack, because everybody learns how to do division and multiply fractions in elementary school. But what if we did modular multiplication? Say I want to encrypt the letter 'n', which is the 14th letter of the alphabet; multiplying by my encryption key gets us 28--but there is no 28th letter of the alphabet! Instead, we wrap around and get the (28-26)=2nd letter of the alphabet--so 'n' encrypts as 'b'. What's the inverse of that? In this case, it turns out to be 20: 20*2=40, 40-26=14, 14='n'. That was a little harder to figure out, so we're getting somewhere... but it wasn't *that* hard to figure out (if it were, we wouldn't be able to figure out our own decryption keys ourselves, which would be kind of useless!), and it turns out that you aren't actually guaranteed to be able to find an inverse all the time--it's only guaranteed to work if the length of your alphabet is a prime number (which 26 is not!), so you have to pick your keys carefully, which makes it easier for an attacker to just try all of the possibilities until they guess the right one.
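Here is a small sketch of why the keys have to be picked carefully, using the same letter numbering (a=1 through z=26, with 26 wrapping to 0). The helper names are mine. A key works for every letter only when it shares no factor with 26; with a key of 2, both 'a' and 'n' land on 'b', so 20 undoes the 'n' example but cannot undo every letter. A key like 3 has a true inverse, 9, because 3*9 = 27, which wraps around to 1.

```python
from math import gcd

M = 26  # size of the alphabet


def to_num(ch):
    return ord(ch) - ord("a") + 1  # a=1 ... z=26


def to_letter(n):
    return chr((n - 1) % M + ord("a"))  # 0 wraps around to 'z'


def mul(ch, key):
    """Encrypt or decrypt one letter by modular multiplication."""
    return to_letter(to_num(ch) * key % M)


def inverse(key):
    """Brute-force search for the key that undoes `key`, if one exists."""
    for candidate in range(1, M):
        if key * candidate % M == 1:
            return candidate
    return None


print(mul("n", 2), mul("a", 2))  # b b  (two letters collide)
print(inverse(2))                # None
print(inverse(3))                # 9
print(mul(mul("n", 3), 9))       # n

# Only keys sharing no factor with 26 are usable.
usable = [k for k in range(1, M) if gcd(k, M) == 1]
print(len(usable))               # 12
```

Twelve usable keys is not much of a haystack for an attacker to search, which is the point made above.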
Real cryptosystems, therefore, get a little more complicated in a few ways. For one thing, although not all encryption functions depend on the size of the alphabet being a prime number, most do require it to be special in some way; additionally, we want keys to be big numbers, in a large range (not just 1 to 26), so that they are harder to guess. Thus, we use more complicated ways of turning strings of letters into numbers before doing arithmetic on them. And for another thing, we have to be fairly clever about finding special mathematical functions (more complicated than just modular multiplication) which have a special property that makes it easy to generate a pair of numbers that are inverses of each other, but really hard to figure out the inverse given just one number. These are called "trapdoor functions"--it's easy to open the trapdoor from one side, but really hard to climb back out from the other side! But if you can find such a function (like, say, modular exponentiation over a range defined by multiplying two large prime numbers), so that you can't figure out what my private key is despite having the function and the public key, then bam, you've got RSA public-key encryption & digital signing! If you know the prime numbers used to generate an RSA key, finding the inverse is trivial, so we can make keypairs for ourselves easily--but inverting a key without knowing how it was made requires factoring a large number, which is much, much more difficult than just doing division or multiplying fractions! And that is what keeps your messages over the internet safe.
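For the curious, here is a toy-sized sketch of that RSA idea. The primes 61 and 53 and the public exponent 17 are tiny illustrative values of my own choosing (real keys use primes hundreds of digits long, plus padding schemes not shown here), and the three-argument `pow` with a negative exponent needs Python 3.8 or later.

```python
from math import gcd

# Two secret primes; anyone who learns them can rebuild the private key.
p, q = 61, 53
n = p * q                  # 3233, published as part of both keys
phi = (p - 1) * (q - 1)    # 3120, easy to compute only if you know p and q

e = 17                     # public exponent; must share no factor with phi
assert gcd(e, phi) == 1
d = pow(e, -1, phi)        # private exponent: the inverse of e modulo phi


def apply_key(value, key):
    """The one shared function: modular exponentiation."""
    return pow(value, key, n)


message = 65
ciphertext = apply_key(message, e)     # anyone can do this with the public key
recovered = apply_key(ciphertext, d)   # only the private key undoes it

signature = apply_key(message, d)      # only the key owner can produce this
checked = apply_key(signature, e)      # anyone can verify it

print(d, recovered, checked)
```

With numbers this small, factoring 3233 back into 61 and 53 takes no effort at all; the security comes entirely from making the primes enormous.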
NOTE: In these toy examples, and in actual RSA, the private key is the inverse of the public key, and the public key is also the inverse of the private key. This makes them interchangeable, and lets you do signing & encryption at the same time (that's generally considered a bad idea, but it can be done). However, while this seems intuitive and natural, it is not necessarily the case. There do exist functions for which a=inverse(b) does not imply that b=inverse(a), in which case the keys are not interchangeable.
Also note that, if you can factor large numbers easily, RSA becomes really easy to crack, just like our toy examples using addition and multiplication (that's what all the fuss about quantum computers is about). But while RSA was the first public-key encryption system invented, it's far from the only one nowadays, and there are plenty of more complicated encryption systems that rely on functions not-quite-as-accessible to an audience who only know arithmetic in which finding the inverses of public keys is much, much harder still.
Sunday, November 10, 2019
Why I Know So Much
Back when meeting new people was actually a thing that happened from time to time (i.e., when I was still in school), I would frequently experience people asking me "how do you know that?" or "how do you know so much?" or some variation on that theme. Usually I would give a short answer like "I read a lot", but I've decided to go into the issue a little deeper. Why do other people perceive me as knowing an unusually large quantity of stuff?
Part of it is that I know stuff about lots of widely differing fields. People have often had misconceptions of what my degrees were in based on whatever topic I may have expounded on in their presence first, and when they encounter me then discoursing intelligently on something wildly different (like when someone who knows I'm good at physics first hears me get excited about linguistics), they are surprised. Why should they be surprised? I think it's because most people only have expertise in a very few fields, and it doesn't elicit surprise when someone reveals how much they know about Their Field. There's something surprising and impressive about having depth in more than one area. Especially since we are all so used to the fallacy of people who are experts in one area proclaiming bullcrap about other domains.
So, answering the question "how do I know so much?" at a deeper level means answering the question "how did I become knowledgeable about so many different things?" Or, "why am I not stuck on a small range of topics?" I won't tackle that directly, because I don't think I can really come up with a better answer than just "I'm interested in it all." It's a fundamental feature of my brain and my personality. And then because I'm interested in it, I go looking for it. And once you go looking for knowledge, especially when you go looking for knowledge in apparently wildly divergent directions, you get compound interest. The more you know, the easier it is to understand even more things, and the more your rate of learning will increase. Richard Hamming once said
Given two people of approximately the same ability and one person who works ten percent more than the other, the latter will more than twice outproduce the former. The more you know, the more you learn; the more you learn, the more you can do; the more you can do, the more the opportunity - it is very much like compound interest. I don't want to give you a rate, but it is a very high rate.
So if I spend even a few more minutes every day acquiring new knowledge than some other person of equivalent ability, over a year, or a few years, or 30 years, I'll end up far, far ahead. And I try not to let time go by when my brain is idle. If I can read something, I'll read it. If I can talk to someone who knows more than I do, I'll try to get them to answer my questions. If I can't read anything, and I can't talk to anybody, I'll philosophize, and try to solve problems I already know about, or try to come up with new problems that are easier to solve. Note that bit about "someone who knows more than I do": if you maintain a good, accurate understanding not only of what you have learned, but also what you know that you don't know yet, that can greatly increase your efficiency at learning new things. Now, all that philosophizing time helps make new connections, and the more stuff you have to connect, the more connections you can make. You still get compound interest if you're just studying one subject, but the exponent gets even bigger when you find that you can start making correlations between highly divergent domains. So, not only is it true that the more you know, the faster you'll know more, but also, the more different kinds of things you know, the faster you'll learn even more different kinds of things. And pretty soon, you realize that most of what you know cannot be easily put into nice neat boxes like "Math" and "English" and "Biology". Did you know that natural language processing and protein folding simulation have a lot of overlap in the types of algorithms used to run them? I know that because I happened to have lunch with a guy working on natural language processing, and a guy working on protein folding in bioinformatics. And then I went and got a degree in computational linguistics so I'd know what those algorithms were!
So, start taking a few minutes every day to think about things you don't know much about yet. Wikipedia is a great help for getting started with that. Keep it up, and pretty soon, you'll know a lot of stuff, too.
Thursday, June 22, 2017
Language & Storytelling
"Show, don't tell."
This is possibly the most oft repeated, best known bit of writing advice ever uttered. But what does it actually mean? Aside from the occasional illustration or diagram, writing is not a primarily visual medium. You are very restricted in what you can actually show. Typically, it breaks down into a metaphor for things like "give good description" and "let the reader make their own inferences"; i.e. "don't be straightforward and boring".
But there is one thing that you actually can literally show in writing, without resorting to illustrations or diagrams--one thing which nevertheless is fairly consistently ignored by most proponents of the "show, don't tell" mantra: The language itself. The language you are writing in, and the language your characters speak.
There is, however, another bit of advice which seems to preclude showing off the manipulation of language itself as a literary device in prose: that being that the language should be invisible. Ideally, the reader forgets that they are reading, becoming immersed in the story. And it turns out that things like accurately transcribed, realistic dialog, accurate phonetic representations of dialectal speech, and similar uses of showing language are super annoying and hard to read! But that does not mean that you can never show language at all--it just means that you have to be careful. When appropriate, doing so can be very powerful.
Showing and even calling attention to features of language itself is powerful because language is intimately tied up with personal and group identity. At the simplest level, this is why good characters have distinct voices; personal identity comes through in the sometimes tiny idiolectal differences in how people use their language, making them identifiable even when you can't hear their literal voice. Being able to write distinct voices, different from the author's own natural voice, such that the reader can tell who is speaking without the need of dialog tags is a relatively rare but extremely valuable skill, and it is all about literally showing rather than telling.
But, the significance of language to identity goes much deeper than that. Beyond individual voice, people's choice of style, dialect, and even what language to speak in often hinges on establishing, or refuting, membership in a group. This is a large part of the function of slang--using it to show that you're part of the "cool" crowd, watching who doesn't use it, or who uses it incorrectly, to identify outsiders to your peer group, or even refusing to use it to distance yourself from that group.
One of the best examples I know of using language to establish identity comes from the film Dances With Wolves. At one point our protagonist, John Dunbar, has been taken prisoner by the army he used to be a part of, and is being interrogated--in English, of course. Now, John is a native speaker of English, and he is clearly capable of answering in English. He could simply refuse to speak. He could say "I refuse to answer your questions", or any number of other things. But what he actually says is:
Through subtitles, the audience knows that he is telling his captors "My name is Dances with Wolves. I will not talk to you anymore. You are not worth talking to." (Although I am told a more accurate translation is "I’m Dances With Wolves. You are nothing. You are not worth words.")
Now, imagine if he simply said that, in English. Would it have nearly the same emotional impact for the viewer? Would it have the same effect on the other characters? No, of course not! By his choice of language, John is communicating his changed sense of identity--that he now considers himself not an American soldier, but a Lakota.
In films and TV shows, if you are going to portray linguistic diversity at all, you basically have no choice but to, well, actually portray it. I.e., have the actors actually speak a different language! (Even in Star Trek, with the conceit of the universal translator in effect, they at least have the alien characters speak gibberish in those rare instances when the translator breaks.) The complexity of ensuring the audience remains engaged, and the extra costs involved, more often than not result in simply avoiding or ignoring issues of linguistic diversity, but if you're going to do it, you have to do it. And when the details of the story make it implausible or politically unwise to use a real human language, you get someone to make a conlang--hence Klingon, Na'vi, Dothraki, etc.
Writers, on the other hand, have a cheat. You can always just write, "My name is Dances with Wolves," he said in Lakota, and never have to learn or invent a single word of another language, or figure out how to keep the reader engaged. The language remains invisible. But you have then committed the grave error of telling when you could have shown! Indeed, of telling in the one singular situation where literal showing is actually possible!
"But," you say, "then the reader won't understand what's going on!"
Well, for one thing, maybe that's the point. Sometimes, the narrator or viewpoint character won't understand what's being said, and the reader doesn't need to understand it either. There may be legitimate disagreement on this one, but consider: is it better to write something like:
And you don't need to be J.R.R. Tolkien to do a good job with a conlang in a story. In fact, while Tolkien was a fantastic conlanger and worldbuilder, and while I may be metaphorically burned at the stake for daring to say this... Tolkien wasn't really the best at integrating different languages into his stories. Dropping in a page of non-English poetry every once in a while, while exciting for language nerds like me, generally results in readers just skimming over that page, even when the reader is a language nerd like me.
So, writers: consider the languages your characters use. Consider actually showing them. And, if appropriate, consider a conlang. If you're not up to conlanging or translating yourself, help is easy to find.
Tuesday, June 20, 2017
Stochastic Democracy
Much thought has been given to methods of ensuring fair and accurate representation in a representative democracy. Ideally, if, for example, 40% of the population supports one party and 60% of the population supports another, then 40% of the representatives in any given governing body should be from the first party and 60% from the second (or at least, 40% and 60% of the votes should come from each party, if we relax the assumption that one rep gets one vote), within the limits of rounding errors introduced by having a small finite number of representatives.
Let us assume, however, that you have instituted a perfect system of ranked-choice voting (or something similar) and a perfect and unbiased system for drawing districts from which representatives will be selected, so that you always have such perfect representation. Or, presume that you have a perfect direct democracy, so issues of proportional representation never come up in the first place. At that point, there is still another problem to be solved: the tyranny of the majority.
The trouble is that what we really want is not proportionate representation at all; it is proportionate distribution of power. Representation is merely a poor, but easier to measure, approximation for power. And in a perfect representative government where 60% of the constituency supports one party, and 60% of the governing votes are controlled by that party, they do not have 60% of the power- they have all of it, presuming you go with a simple majority voting scheme to pass legislation. If you require some larger majority, however, the problem still does not go away; supermajority voting requirements simply mean that it takes a larger majority to become tyrannical, and in the meantime you have a roadblock: the minority party may not be able to pass anything, but it can keep the other side from passing anything either! Nobody getting anything done may be an equitable distribution of power, but only because any percentage of 0 is still 0.
It would be better if we could somehow guarantee that a 60% majority party would get what they want 60% of the time when in conflict with the minority, and a 40% minority would get what they want 40% of the time when in conflict with the majority. It turns out that there is a remarkably simple way to guarantee this result!
Rather than allowing the passing of legislation to be decided by a purely deterministic process, we introduce an element of randomness. When any issue is voted upon, the action to be taken should not be determined simply by whichever side gets the most votes; rather, the result of a poll is a random selection from a distribution of options, weighted by the number of votes cast for each one. If all representatives always vote strictly along party lines, then over a large number of votes on issues over which the two parties disagree, 60% will be decided in favor of the majority party, and 40% in favor of the minority party- but with no way to predict ahead of time which 60 or 40% they will be, such that it is impossible to game the system by strategically timing the introduction of certain bills.
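A quick simulation shows the idea in action. This is just a sketch under the party-line assumption above: `stochastic_vote` is a name I've invented, the 60/40 tallies are the running example, and the seed is arbitrary.

```python
import random


def stochastic_vote(tallies, rng=random):
    """Pick one option at random, weighted by the votes cast for each."""
    options = list(tallies)
    weights = [tallies[option] for option in options]
    return rng.choices(options, weights=weights, k=1)[0]


rng = random.Random(2017)  # seeded only so that runs are repeatable
tallies = {"majority": 60, "minority": 40}

# Simulate a long series of issues on which the two parties disagree.
outcomes = [stochastic_vote(tallies, rng) for _ in range(10_000)]
share = outcomes.count("majority") / len(outcomes)
print(f"majority prevailed on {share:.1%} of contested votes")
```

Over many contested votes the majority's share of wins should hover near 60%, though no single outcome can be predicted in advance.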
Even if we imagine a more extreme split, like 90% vs. 10%, it is still impossible under this system for the majority to run away with tyrannical measures that harm the minority for very long. Should they try, it incentivizes the minority party to introduce ever more extreme countermeasure proposals more and more frequently, one of which is guaranteed to eventually pass! The majority party therefore has its own incentive to attempt to build consensus even with small minority factions, creating legislation that will benefit everyone.
Of course, it seems highly unlikely that any country could be convinced to run their government this way in real life! But there are, I think, two good reasons for considering the idea. First, this could make for an interesting political backdrop in a sci-fi or fantasy story; perhaps your fantasy society finds the inclusion of randomness in their political process to be a religious imperative, as it is through the interpretation of random events that the will of the gods is revealed. Second, it highlights the mostly-overlooked but very important distinction between equitable representation and equitable distribution of power. Hopefully, having looked at this simple solution to the problem will help someone to discover another, perhaps more practically tenable, one.
Wednesday, January 4, 2017
Building a 4D World
Making art assets for regular games is hard enough; building full 4D models - things which allow you to look at them from any arbitrary vantage point in 4 dimensions - is just ridiculous. So procedural generation and simple geometric objects will be our friends. In fact, let's just do nothing but try to look at a randomly-generated 4D maze, constructed entirely out of right-angled, unit-sized hypercubes, as viewed through a 3D hyperplane slicing through the maze at an arbitrary angle and position.
Maze generation algorithms are all basically variations on finding spanning trees for the graph of maze cells, and as such are not strongly tied to any particular dimensionality; any of them can be fairly easily converted to work on arbitrary graphs, or in grids of arbitrary dimension. So, we'll leave maze generation as a problem space aside, and just assume that we can get a good maze map.
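As an illustration of how little the dimensionality matters, here is a rough Python sketch of one such algorithm (randomized depth-first search) on a grid of arbitrary dimension. This is not the generator used for the maze in this post; `generate_maze`, its return format, and the 3x3x3x3 example size are assumptions made for the example.

```python
import random

def generate_maze(shape, seed=0):
    """Carve a maze on a grid of any dimension with randomized depth-first search.

    Returns the set of open passages, each a frozenset of two adjacent cells.
    The passages form a spanning tree of the grid graph.
    """
    rng = random.Random(seed)
    start = tuple(0 for _ in shape)
    visited = {start}
    passages = set()
    stack = [start]
    while stack:
        cell = stack[-1]
        # Collect unvisited neighbours one step away along each axis.
        neighbours = []
        for axis, size in enumerate(shape):
            for step in (-1, 1):
                pos = cell[axis] + step
                if 0 <= pos < size:
                    candidate = cell[:axis] + (pos,) + cell[axis + 1:]
                    if candidate not in visited:
                        neighbours.append(candidate)
        if not neighbours:
            stack.pop()  # dead end: backtrack
            continue
        nxt = rng.choice(neighbours)
        visited.add(nxt)
        passages.add(frozenset((cell, nxt)))
        stack.append(nxt)
    return passages

maze = generate_maze((3, 3, 3, 3))
print(len(maze))  # 80 passages: one fewer than the 81 cells
```

The only place the number of dimensions appears is the length of `shape`; the same function produces a 2D or 3D maze if given a shorter tuple.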
One way to render our view of the maze would be to represent the walls of the maze with polygons in 4 dimensions, and calculate the intersection of every edge with the currently-visible hyperplane to generate a 3D model that can be rendered in The Usual Way, using your 3D rendering library of choice.
Calculating all of those intersections, though, seems really complicated, and there's a much simpler option that takes advantage of the fact that our maze is restricted to a hypercubical grid: we can use raycasting!
Raycasting is more typically used to give the illusion of 3D with a 2D map, in games like Wolfenstein 3D. It involves marching a ray from the camera's position through the grid until you find a wall, and then drawing a column of pixels with height scaled based on the length of the ray, so that walls appear to recede into the distance. The code for this is fairly simple, and when you only have to cast one ray per pixel of horizontal resolution, it's fast enough that you can even do it in pure JavaScript in a web browser. In order to render a 3D slice of a 4D map, we'll have to instead cast one ray for every pixel on the whole two-dimensional screen. That's significantly slower, but the code for casting each individual ray is not much more complex; it's merely about twice as long as that for a traditional raycaster, due to handling twice as many cases.
Casting Rays & Finding Walls
To get reasonable framerates, I implemented my 4D raycaster in GLSL to run on a GPU. The first step is calculating the appropriate ray given a pixel position:
uniform float u_depth;
uniform vec2 u_resolution;
uniform vec4 u_origin;
uniform vec4 u_rgt;
uniform vec4 u_up;
uniform vec4 u_fwd;
void main(){
    vec2 coords = gl_FragCoord.xy - (u_resolution / 2.0);
    vec4 ray = u_fwd*u_depth + u_rgt*coords.x + u_up*coords.y;
    gl_FragColor = cast_vec(u_origin, ray, 10.0);
}
(For the uninitiated, "uniform" variables have their values passed in from outside the shader program.)
This code converts the pixel location into a new coordinate system with the origin at the center of the field of view, instead of one corner. Then it calculates a direction in which to cast a ray based on the camera's egocentric coordinate system; the depth of the virtual camera in front of the screen specifies how much of the forward direction to use, which constrains the viewing angle, while the x and y pixel coordinates indicate how far to go along the right and up axes. Note that, so far, aside from the fact that all of the vectors have four components, this looks entirely like 3D code; there is no mention of the fourth, ana, axis. This is because we only care about casting rays contained in the hyperplane defined by the F, R, and U axes, so we only need to construct rays with components from those basis vectors. If we wanted to view a projection of the world into a 3D hyperplane, rather than a slice, then we could define an arbitrary w-resolution, in addition to the x and y resolutions, and cast additional rays with components from the ana basis vector of the camera's coordinate system. But multiplying the number of rays like that does not do good things to your framerate!
(Note that, although we only use a 3D subset of the camera's egocentric coordinate system, the resulting ray can, and usually will, have non-zero components in all four of the x, y, z, and w axes. This will only fail to be the case if the egocentric coordinate system is exactly parallel to the map grid.)
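A small Python sketch of the same calculation makes this easy to check by hand. The camera basis below (grid-aligned, then rotated 30 degrees in the x-w plane), the 640x480 resolution, and the depth of 320 are all made-up values for illustration; `pixel_ray` simply mirrors the arithmetic in the shader's main function.

```python
import math

def pixel_ray(px, py, resolution, depth, fwd, rgt, up):
    """Ray direction for a pixel, mirroring the arithmetic in the shader's main()."""
    cx = px - resolution[0] / 2.0
    cy = py - resolution[1] / 2.0
    return tuple(f * depth + r * cx + u * cy for f, r, u in zip(fwd, rgt, up))

# Hypothetical camera basis, components ordered (x, y, z, w).
angle = math.radians(30)
fwd = (0.0, 0.0, 1.0, 0.0)
rgt = (math.cos(angle), 0.0, 0.0, math.sin(angle))
up = (0.0, 1.0, 0.0, 0.0)

# A pixel to the right of and above the screen center.
ray = pixel_ray(500, 300, (640, 480), 320.0, fwd, rgt, up)
print(ray)
```

Even though the ray was built from only three basis vectors, its w component comes out at roughly 90, because the right vector itself leans into the w axis.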
After calculating the ray corresponding to a particular pixel, we pass the origin point, the ray, and a distance limit into the raycasting algorithm, which returns a pixel color. So, let's look at how that works. First, a useful utility function:
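// Find the distance to the next cell boundary
// for a particular vector component.
// The caller swizzles v so that the component of
// interest is in v.x; o is the origin coordinate
// along that same axis.
float cast_comp(vec4 v, float o, out int sign, out int m){
    float delta, fm;
    if(v.x > 0.0){
        sign = 1;
        fm = floor(o);
        delta = fm + 1.0 - o;
    }else{
        sign = -1;
        fm = ceil(o - 1.0);
        delta = fm - o;
    }
    m = int(fm);
    // Scale the other components by the same factor
    // to get the full length of the ray segment.
    return length(vec4(delta,delta*v.yzw/v.x));
}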
This does three things:
- Calculates the distance you have to move in the direction of a ray, starting from a particular origin point, before you hit a cell boundary in a particular grid-axis-aligned direction.
- Calculates the sign of the ray's propagation along a particular grid-axis direction.
- Calculates the integer position of the grid cell in which the ray originates along a particular grid-axis.
// Starting from the player, we find the nearest gridlines
// in each dimension. We move to whichever is closer and
// check for a wall. Then we repeat until we've traced the
// entire length of the ray.
vec4 cast_vec(vec4 o, vec4 v, float range){
    v = normalize(v);
    // Get the initial distances from the starting
    // point to the next cell boundaries.
    ivec4 s, m;
    vec4 dists = vec4(
        cast_comp(v.xyzw, o.x, s.x, m.x),
        cast_comp(v.yxzw, o.y, s.y, m.y),
        cast_comp(v.zxyw, o.z, s.z, m.z),
        cast_comp(v.wxyz, o.w, s.w, m.w)
    );
    // Inverting the elements of a normalized vector
    // gives the distances you have to move along that
    // vector to hit a cell boundary perpendicular
    // to each dimension.
    vec4 deltas = abs(vec4(1.0/v.x, 1.0/v.y, 1.0/v.z, 1.0/v.w));
The deltas give us the last bit of information we need to initialize the algorithm: how big the steps are in the direction of the ray in order to move one full grid unit in a grid-axis-aligned direction. This is different from the initial distance needed to get from an origin point somewhere inside a cell to the cell boundary, calculated above.
    // Keep track of total distance.
    float distance = 0.0;
    // Keep track of the dimension perpendicular
    // to the last cell boundary, and the value of the
    // last cell the ray passed through.
    int dim, value;
    // while loops are not allowed, so we have to use
    // a for loop with a fixed large max number of iterations
    for(int i = 0; i < 1000; i++){
        // Find the next closest cell boundary
        // and increment distances appropriately
        if(dists.x < dists.y && dists.x < dists.z && dists.x < dists.w){
            dim = 1*s.x;
            m.x += s.x;
            distance = dists.x;
            dists.x += deltas.x;
        }else if(dists.y < dists.z && dists.y < dists.w){
            dim = 2*s.y;
            m.y += s.y;
            distance = dists.y;
            dists.y += deltas.y;
        }else if(dists.z < dists.w){
            dim = 3*s.z;
            m.z += s.z;
            distance = dists.z;
            dists.z += deltas.z;
        }else{
            dim = 4*s.w;
            m.w += s.w;
            distance = dists.w;
            dists.w += deltas.w;
        }
In this section, we keep track of the length of the ray that it would take to get to the next cell boundary in any of the four grid axes, stored in the dists vector. Whichever one is smallest becomes the new length of the ray. Additionally, we record which axis we stepped along (in dim), update the coordinates of the cell through which the ray will pass next (in the m vector), and then increment the length of the theoretical ray which would hit the next cell boundary along the same grid axis.
After that, we just check to see if we've hit a wall yet. If there were any other objects in the game, this is where we'd add more detailed intersection checks for the objects within a particular cell, but with just a simple maze we just have to check the value of the cell to see if it is a wall or not:
        value = get_cell(m.x, m.y, m.z, m.w);
        // If the cell is a wall, or we've gone too far, terminate.
        if(value == 255 || distance >= range){
            break;
        }
    }
    // Calculate the actual intersection coordinates
    // and use them to look up a texture
    vec4 ray = o + distance * v;
    vec3 tex = calc_tex(dim, ray);
    return vec4(tex, 1.0);
}
Once the casting loop terminates, we can use the coordinates of the end of the completed ray to look up the color of the bit of wall we ran into.
The conversion of the vec3 color to a vec4 in the last line is just to add in an alpha channel, which is always 1, and is thus left out of the procedural texture code below for simplicity.
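For readers who would rather step through the marching loop on a CPU, here is a compact Python sketch of the same idea. It is not a line-for-line port of the shader: it works for any number of dimensions, returns None instead of a color when nothing is hit within range, and takes a hypothetical `is_wall` callback in place of `get_cell`. The `corridor` map is likewise invented for the example.

```python
import math

def cast_ray(origin, direction, is_wall, max_range):
    """March a ray through a unit grid of any dimension until it enters a wall.

    Returns (distance, axis, cell) for the first wall cell entered, or None
    if no wall is found within max_range. Assumes direction is non-zero.
    """
    norm = math.sqrt(sum(c * c for c in direction))
    v = [c / norm for c in direction]
    cell = [math.floor(o) for o in origin]
    step, dists, deltas = [], [], []
    for o, c, m in zip(origin, v, cell):
        if c > 0:
            step.append(1)
            dists.append((m + 1 - o) / c)  # ray length to the next boundary
            deltas.append(1.0 / c)         # ray length per full grid unit
        elif c < 0:
            step.append(-1)
            dists.append((m - o) / c)
            deltas.append(-1.0 / c)
        else:
            # The ray never crosses a boundary along this axis.
            step.append(0)
            dists.append(math.inf)
            deltas.append(math.inf)
    while True:
        # Step across whichever cell boundary is closest along the ray.
        axis = min(range(len(v)), key=dists.__getitem__)
        distance = dists[axis]
        if distance >= max_range:
            return None
        cell[axis] += step[axis]
        dists[axis] += deltas[axis]
        if is_wall(tuple(cell)):
            return distance, axis, tuple(cell)

def corridor(cell):
    """Hypothetical map: everything is wall except a 5-cell corridor along x."""
    x, y, z, w = cell
    return not (0 <= x < 5 and y == 0 and z == 0 and w == 0)

hit = cast_ray((0.5, 0.5, 0.5, 0.5), (1.0, 0.0, 0.0, 0.0), corridor, 10.0)
print(hit)  # (4.5, 0, (5, 0, 0, 0))
```

Starting in the middle of the first corridor cell and looking straight down the x axis, the ray crosses four open boundaries and hits the end wall after 4.5 units.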
3D Textures
Once we've found a wall at the end of a ray, the coordinate for the axis perpendicular to that wall (identified by the value of dim) will have an integer value, while the fractional parts of the remaining three coordinates describe the position inside a cube which forms the boundary of one side of the hypercubical grid cell. This is exactly analogous to the 3D situation, where we would have two non-integer coordinates describing a location inside a square that forms the boundary of a cubical grid cell, or the 2D situation (where raycasting is more typically applied) where we have one non-integer coordinate identifying a position along a line forming the boundary of a square.
In order to figure out the color of a particular spot on the 3D wall of a 4D cell, we'll need three-dimensional textures. Say that we want our walls to be 512 pixels wide. For a 2D wall around a 3D cell, with one byte for each of three color channels, that means we'd need a texture taking up 768KB in memory (512 × 512 × 3 bytes). That's entirely reasonable. We could easily have different textures for walls facing in all six directions around a cube, to help orient you in the maze. But for a 3D wall around a 4D cell, we'd need 384 megabytes (512 × 512 × 512 × 3 bytes) for each texture. Even if I had the digital artistic talent to create full 3D textures like that, that's rather a lot of memory to devote to textures. Once again, we can turn to procedural generation to create interesting wall textures.
There are lots of ways to do procedural texture generation, but the algorithm I picked looks basically like this:
uniform vec3 u_seed;
// Color scale lookup texture, sampled in calc_tex below.
uniform sampler2D u_colorscale;
const vec3 grey = vec3(0.2);
const vec3 red = vec3(1.0,0.5,0.5);
const vec3 green = vec3(0.5,1.0,0.5);
const vec3 blue = vec3(0.5,0.5,1.0);
const vec3 yellow = vec3(0.71,0.71,0.5);
vec3 calc_tex(int dim, vec4 ray){
    ray = fract(ray);
    vec3 coords, tint;
    // Drop the coordinate perpendicular to the wall and
    // pick a tint based on which axis that was.
    if(dim == 1 || dim == -1){
        coords = ray.yzw;
        tint = red;
    }
    else if(dim == 2 || dim == -2){
        coords = ray.xzw;
        tint = green;
    }
    else if(dim == 3 || dim == -3){
        coords = ray.xyw;
        tint = blue;
    }
    else if(dim == 4 || dim == -4){
        coords = ray.xyz;
        tint = yellow;
    }
    float h = julia(coords, u_seed);
    if(h == 0.0){
        return mix(tint/16.0, grey, layered_noise(coords, 3, 3));
    }
    vec3 base = texture2D(u_colorscale, vec2(h, 0.5)).rgb;
    return mix(tint/8.0, base, layered_noise(coords, 5, 3));
}
It starts with a basic case analysis to pick some parameters based on which direction we hit the wall from, and thus which kind of hypercube boundary cell we're calculating the texture for.
Then, the basic texture is drawn from a 3D generalization of a Julia fractal. In order to break up the psychedelic color bands produced at the edges, we layer on top of this a bunch of 3D simplex noise at multiple frequencies; the layered_noise function (implementation not shown) takes a start and a length for the range of octaves of basic noise that should be added together.
That ends up producing views something like this:
Note that not all of the walls in this scene seem to be at right angles! This despite the fact that the world consists of nothing but maze walls aligned with a perfect right-angled hypercubical grid.
The reason for this is fairly straightforward; the viewing volume isn't parallel with the wall grid. Just like an oblique 2D slice through a 3D cube can end up with a hexagonal outline, a 3D slice through a 4D cube can also end up producing an outline with 2D segments meeting at all sorts of weird angles.
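The lower-dimensional analogy is easy to verify numerically. The Python sketch below finds the points where a slicing plane crosses the edges of a unit cube, and does the same for a hyperplane and a unit 4D hypercube. The function name `slice_vertices` and the particular slicing planes are choices made for this example only.

```python
from itertools import product

def slice_vertices(dim, normal, offset):
    """Points where the hyperplane normal . p = offset crosses edges of the unit dim-cube."""
    points = set()
    for corner in product((0, 1), repeat=dim):
        for axis in range(dim):
            if corner[axis] == 1:
                continue  # visit each edge once, starting from its low end
            a = sum(n * c for n, c in zip(normal, corner))  # value at the low end
            b = a + normal[axis]                            # value at the high end
            if a == b:
                continue  # edge runs parallel to the slice
            t = (offset - a) / (b - a)
            if 0.0 <= t <= 1.0:
                point = [float(c) for c in corner]
                point[axis] = t
                points.add(tuple(point))
    return points

print(len(slice_vertices(3, (0, 0, 1), 0.5)))     # 4: axis-aligned slice, a square
print(len(slice_vertices(3, (1, 1, 1), 1.5)))     # 6: oblique slice, a hexagon
print(len(slice_vertices(4, (1, 1, 1, 1), 1.5)))  # 12 corners on an oblique 3D slice
```

An axis-aligned slice of the cube gives the expected four corners, while tilting the plane gives six; the tilted 3D slice of the hypercube has twelve corners, so its faces cannot all be the axis-aligned squares you would get from a grid-parallel view.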
