This course has been updated! We now recommend you take The Good Parts of JavaScript and the Web course.

The "Strings and Arrays" lesson is part of the full JavaScript the Good Parts course featured in this preview video. Here's what you'd learn in this lesson:

A look at the way Strings behave in JavaScript and common String operations. In many languages, an Array is a linear sequence of memory divided into equal-sized blocks. In JavaScript, Arrays behave very differently. Sometimes they can be more efficient, but most of the time they are less efficient.

Transcript from the "Strings and Arrays" Lesson

[00:00:00]
>> [MUSIC]

[00:00:04]
>> Douglas Crockford: We have two Boolean values, you'll be glad to know that they're true and false, so we got that right.
>> Speaker 2: [LAUGH]
>> Douglas Crockford: Then we have strings. Strings are a sequence of characters. Do you know why they are called strings? It turns out nobody knows. It's a mystery. The earliest occurrence I could find in the literature of someone referring to a string as a type of data structure, being a sequence of characters, was in the ALGOL 60 report.

[00:00:36] And I asked John McCarthy who was on that committee, why did you guys call them strings? And he got angry with me.
>> Speaker 2: [LAUGH]
>> Douglas Crockford: And he has since passed away, so I can't follow up with him. So we don't know why they're called strings. Earlier there was a sense of having strings of things, like strings of symbols or strings of statements.

[00:00:58] So a program was a string of statements, but string meaning specifically a text containing a sequence of characters, we don't know why we call it that. It doesn't make sense that we call it that, but we do. So a string is a sequence of zero or more 16-bit Unicode characters. This is UCS-2, which is not quite the same thing as UTF-16.

[00:01:21] Because JavaScript has no knowledge of the surrogate pairs which get you out of the basic plane. Now it turns out surrogate pairs can easily transit through the system, and if your operating system knows about them, they'll display correctly. But JavaScript will always count them as two characters, not as a single paired character.
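
To illustrate the counting behavior just described, here is a minimal sketch. The sample character (U+1F600) is an assumption chosen only because it lies outside the basic plane; it does not come from the talk.

```javascript
// U+1F600 lies outside the Basic Multilingual Plane, so it is stored
// as a surrogate pair: two 16-bit units.
const smiley = "\uD83D\uDE00";

// length counts 16-bit units, so one visible character reports as two.
console.log(smiley.length); // 2

// Each half of the pair can be read individually.
console.log(smiley.charCodeAt(0).toString(16)); // "d83d"
console.log(smiley.charCodeAt(1).toString(16)); // "de00"

// A character inside the basic plane counts as one.
console.log("a".length); // 1
```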

[00:01:42] JavaScript does not have a character type. So if you want to talk about a single character that's just a string with length one. Strings are immutable, which means once a string is made it cannot be modified. That turns out to be a really good thing.
>> Douglas Crockford: So if you want to change a string, you actually produce a new string.

[00:02:01] Which might contain parts of the content of the previous one. Similar strings are equal under triple equal, which is great. Java got that wrong, JavaScript got that right. Hooray for JavaScript. So if you have two strings containing the same sequence of characters, they're equal. String literals can be written with either single quotes or double quotes, and they both work the same way with \ for escapement.
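
A short sketch of the points above: string methods return new strings rather than modifying the original, strings with the same characters are equal under `===`, and both quote styles produce the same value. The variable names and sample text are illustrative only.

```javascript
// String methods never modify the original; they return a new string.
const original = "good parts";
const shouted = original.toUpperCase();
console.log(original); // "good parts"
console.log(shouted);  // "GOOD PARTS"

// Two strings built in different ways are equal under === when they
// contain the same sequence of characters.
const left = "Java" + "Script";
const right = ["Java", "Script"].join("");
console.log(left === right); // true

// Single and double quotes produce the same string, with the same escapes.
console.log('line\n' === "line\n"); // true
```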

[00:02:29] So since we have two ways of writing strings, I use double quotes for external strings. Things like HTML templates, or text to the users, or stuff like that. And I use single quotes for internal strings and characters, names of properties, stuff that's internal to the program. This is just a convention, there's nothing to enforce it, but I think it makes it easier to understand the use of a string within a program.

[00:02:58] We have a plus operator which unfortunately both adds and concatenates. This was a bad idea lifted from Java. In Java it's not so bad because Java is strongly typed, so you can predict which one it's going to do. JavaScript being loosely typed, it's often hard to predict what it's going to do, and it often does the wrong thing.

[00:03:17] So a common problem is, someone will take a number out of a form field, and want to add it to another number. The value that you get from a form field is always a string, but it's hard to remember sometimes because it's labeled number. And you've told the user to type in a number, and you wanna take the number and add it, but when you add it, you actually concatenate it.

[00:03:45] Which can cause the answer to be off by many thousands, which is a bad thing, and that's a really common error, it happens all the time. There are a number of ways of converting a number into a string. Numbers, being objects, have toString methods. There's also a String function which is more commonly used, you pass the thing there.

[00:04:10] Be sure not to put new in front of the String function, cuz that does something completely different. You can then go the other way, and convert a string to a number. You can use the Number function, or you can use the + prefix operator, which is the preferred way because it's the shortest.
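
The form-field pitfall and the conversions described above can be sketched as follows. The value `"5"` stands in for whatever a form field would return; it is a hypothetical input, not one from the talk.

```javascript
// A value read from a form field is always a string (hypothetical input).
const fieldValue = "5";

// + concatenates when either operand is a string.
const concatenated = fieldValue + 10;
console.log(concatenated); // "510"

// Convert first, then add: the + prefix operator or the Number function.
const addedWithPlus = +fieldValue + 10;
const addedWithNumber = Number(fieldValue) + 10;
console.log(addedWithPlus, addedWithNumber); // 15 15

// Number to string: the toString method or the String function.
const viaToString = (42).toString();
const viaString = String(42);
console.log(viaToString === viaString); // true

// With new, String builds a wrapper object instead of a string value.
console.log(typeof String(42));     // "string"
console.log(typeof new String(42)); // "object"
```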

[00:04:30] There's also the parseInt function, which takes a string and a radix, which is the base of the number being converted from. parseInt will stop parsing once it sees a character which is not a digit in the base that it's working in. And it doesn't tell you that it stopped early, and it doesn't tell you what the remaining characters are.

[00:04:54] So it's a little difficult to use as a parsing function, despite its name. And it is problematic if there is any possibility that the number that you're trying to parse has a leading 0 in it, which is very common for dates and times, anything that by convention has a leading 0.

[00:05:14] Because it does the stupid C thing of thinking that a leading 0 means octal. So now it's in base 8, and if it sees an 8, 8 is not a digit in base 8, so it stops at that point. So parseInt("08") is 0, not 8, a really common error.

[00:05:35] So the defense against that is to always put in the radix of 10.
>> Douglas Crockford: And it'll always do the right thing. It's very unlikely that you're gonna parse anything that's not base 10. But occasionally, base 16, and in that case you'll definitely want to make it explicit as well. Strings have a length property, and it tells you how many 16-bit code points are in the string.
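
A minimal sketch of the parseInt behavior discussed above, always passing the radix. Older engines could read a string like `"08"` as octal when no radix was given; passing 10 avoids depending on how a particular engine treats the leading 0. The sample inputs are illustrative.

```javascript
// With an explicit radix of 10, a leading zero is harmless.
const day = parseInt("08", 10);
console.log(day); // 8

// parseInt stops at the first character that is not a digit in the
// given base, and gives no sign that it stopped early.
const pixels = parseInt("12px", 10);
console.log(pixels); // 12

// Base 16 should be made explicit too.
const hex = parseInt("ff", 16);
console.log(hex); // 255

// By contrast, Number rejects the whole string if any part is not numeric.
const strict = Number("12px");
console.log(strict); // NaN
```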

[00:06:05] Strings have lots of methods, again, strings are objects. So there's charAt, which will return a single character within the string. There's indexOf, that does searching. Search also does searching. There's slice, split, and substring, which can return parts of the string, lots of useful things. There's now a trim method which was added in ES5.

[00:06:34] Which is good because prior to that we didn't have one, that was a big omission. So if you are in ES3, and you want to be able to trim, it turns out you can fix it, you can add something to String.prototype. And awesomely, all of your strings will know how to trim themselves.
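
The slide for this fix is not part of the transcript, so the following is only a sketch of how such a patch might look. The helper name `stripWhitespace` and the regular expression are assumptions; the guard ensures a native trim is never overwritten.

```javascript
// Hypothetical fallback: remove leading and trailing whitespace.
function stripWhitespace() {
    return String(this).replace(/^\s+|\s+$/g, "");
}

// Install it only when the engine does not already provide trim.
if (typeof String.prototype.trim !== "function") {
    String.prototype.trim = stripWhitespace;
}

// Every string now knows how to trim itself.
console.log("  hello  ".trim()); // "hello"

// The fallback can also be exercised directly.
console.log(stripWhitespace.call("\t padded \n")); // "padded"
```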

[00:06:54]
>> Douglas Crockford: We could have a supplant method. So here we have a template, an HTML template with three variables in it. And we have a data structure which has properties whose names match those variables. So the obvious thing you'd want to do is to take the template, and do the replacement.

[00:07:14] You can do that with the supplant method on the template. Unfortunately, JavaScript doesn't have a method like that. But it's easy to write one, so we can go to String.prototype.supplant, and add a method which will do that very thing, using the replace method. Passing it a regular expression which is looking for those patterns, and which is also given a function which will make the substitution from the object.
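
The template and the method body shown on the slide are not in the transcript. The sketch below is one way to write such a method with replace, a regular expression, and a substitution function, as described above. It assumes variables are written as `{name}`; the template text and data are made up for illustration.

```javascript
// Assumed placeholder syntax: {name}. Installed only if not already present.
if (typeof String.prototype.supplant !== "function") {
    String.prototype.supplant = function (data) {
        return this.replace(/\{([^{}]*)\}/g, function (match, name) {
            const value = data[name];
            // Substitute strings and numbers; leave unknown variables intact.
            return typeof value === "string" || typeof value === "number"
                ? String(value)
                : match;
        });
    };
}

// Hypothetical template with three variables, and matching data.
const template = "<p>{first} {last} scored {score}</p>";
const record = { first: "Ada", last: "Lovelace", score: 42 };

const rendered = template.supplant(record);
console.log(rendered); // "<p>Ada Lovelace scored 42</p>"

// A variable with no matching property is left as written.
const untouched = "{missing}".supplant(record);
console.log(untouched); // "{missing}"
```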

[00:07:47] So an array is a linear sequence of memory which is divided into equal-sized pockets which can be indexed by number, which allows for a very fast way of retrieving variables from a set. JavaScript doesn't have anything like that. Instead, JavaScript has objects, and it adds a little bit of magic to an object to create something that it calls an array.

[00:08:12] But which acts very differently than real arrays, in some ways much better, in some ways much worse. So array inherits from object, and the indexes which you would put in the brackets are converted into strings. And then hashed into the object that backs the array in order to retrieve the value, which is very efficient for sparse arrays.

[00:08:36] A sparse array is an array where most of the cells are gonna be empty, JavaScript kicks ass at that. Unfortunately, in most of our applications we don't use sparse arrays, we use dense arrays, and JavaScript is really inefficient at that. But there is one advantage here, when you're creating a new array, you don't have to say what length it is.

[00:08:55] And you don't have to say what type is going to go into it, because any type can go into it, and it doesn't actually have a length. So you can put as much stuff in there as you want, and it will never fail until memory is exhausted. Arrays have a special length property which does not contain the number of elements in the array.

[00:09:19] What it does contain is one more than the highest integer subscript in the array. Now for dense arrays, that means the same thing, but for sparse arrays it means something very different. And the reason for having that is so that you can loop through an array using a for statement.

[00:09:41] And have it work pretty much the way it does in other languages. You could use for in on arrays, but I recommend that you don't. The reason is that, usually in arrays we expect to visit the elements in the right order. For in does not guarantee the order in which it's going to return the values.

[00:10:04] And so you don't know for sure that you're gonna get the correct sequence, and that could cause some programs to fail. We have a really nice array literal format, similar to the object literal format. So here I've got an array containing three elements, and its length is four.

[00:10:24] I'm sorry, its length is three, and the elements are numbered zero, one, and two. I can add a new method, or a new property, a new element, to the end of an array by assigning to the element at the array's length. Because the array's length is always one bigger than the highest numbered element.

[00:10:43] This guarantees this element will be one bigger than the previous one, and so you can always append to that. Arrays are objects, and so the dot notation works, but the dot notation doesn't work with numeric names, because it gets confused with the decimal point. So you should always use the bracket notation with arrays, except when calling methods.
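
The length and append behavior described above can be sketched as follows. The arrays are made-up examples, not the ones from the slides.

```javascript
// An array literal with three elements, numbered 0, 1, and 2.
const letters = ["a", "b", "c"];
console.log(letters.length); // 3

// Append by assigning to the element at the array's length.
letters[letters.length] = "d";
console.log(letters.length); // 4

// Subscripts are converted to strings, so these name the same element.
console.log(letters["1"] === letters[1]); // true

// For a sparse array, length is one more than the highest subscript,
// not the number of elements.
const sparse = [];
sparse[99] = "only element";
console.log(sparse.length);              // 100
console.log(Object.keys(sparse).length); // 1

// A for statement driven by length visits elements in order.
const visited = [];
for (let i = 0; i < letters.length; i += 1) {
    visited.push(letters[i]);
}
console.log(visited.join("")); // "abcd"
```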

[00:11:10] There are lots of useful methods on arrays, particularly some of the new methods like map and forEach that were added in ES5, which allow you to take a function and apply it to each of the elements of the array.
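
A brief sketch of the two methods just named, using made-up data: map builds a new array from the function's results, while forEach simply calls the function on each element.

```javascript
const amounts = [1, 2, 3];

// map applies the function to each element and collects the results.
const doubled = amounts.map(function (n) {
    return n * 2;
});
console.log(doubled); // [2, 4, 6]

// forEach applies the function to each element and returns nothing.
let total = 0;
amounts.forEach(function (n) {
    total += n;
});
console.log(total); // 6
```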
>> Douglas Crockford: Arrays have a sort function which unfortunately does horrible things.

[00:11:31] So here we have an array of numbers, and we sort it, and it comes out like that. Anyone guess what happened?
>> Speaker 3: [INAUDIBLE].
>> Douglas Crockford: Yeah, it sorts them as strings. On each comparison, it forces those two numbers to be strings, and does a comparison of strings. So it's wildly inefficient, and it gives you the incorrect results.

[00:11:59] Fortunately, it's possible to pass a compare function to sort where we can say do it my way. Where you can subtract two numbers and get the right result. By default for numbers it does something that is tragically wrong. You can delete elements from an array, using the delete operator.
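
The array on the slide is not in the transcript, so the numbers below are assumed for illustration. The sketch shows the default string ordering and the compare function that subtracts two numbers.

```javascript
const readings = [23, 4, 42, 8, 16, 15];

// Default sort compares the elements as strings, so "15" < "4" < "8".
const asStrings = readings.slice().sort();
console.log(asStrings); // [15, 16, 23, 4, 42, 8]

// A compare function that subtracts gives numeric order.
const asNumbers = readings.slice().sort(function (a, b) {
    return a - b;
});
console.log(asNumbers); // [4, 8, 15, 16, 23, 42]
```

The `slice()` calls copy the array first, because sort rearranges the array it is called on.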

[00:12:21] But delete will not do what you want, because it will leave a hole in the array. There's a splice method which will close up the hole, let me show you an example of this. So here I have an array of four elements, and I want to delete the b, which is at index 1.

[00:12:40] So I could delete myArray of 1, but what I get back now is an array with a hole in it, because element 1 is now undefined. Undefined is the value of missing elements. So if I wanna make that hole go away, I could call myArray.splice, passing it a 1 meaning start at index 1, and another 1 meaning delete one element.

[00:13:04] And the way that it does that is that it then goes to element 2, deletes it and reinserts it as element 1. And goes to element 3, deletes it and reinserts it as element 2. So that's how it closes up the hole, so it's going to be slow.

[00:13:24] So appending things to the end of arrays tends to be fast, removing things from anywhere except the end tends to be very slow.
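
The delete and splice example walked through above can be sketched like this. The element values are assumed from the description (four elements, with b at index 1), and two separate arrays are used so each operation starts from the same contents.

```javascript
// delete leaves a hole: the length stays the same and element 1 is missing.
const withHole = ["a", "b", "c", "d"];
delete withHole[1];
console.log(withHole.length); // 4
console.log(withHole[1]);     // undefined
console.log(1 in withHole);   // false

// splice(1, 1) starts at index 1, removes one element, and closes the gap.
const closedUp = ["a", "b", "c", "d"];
const removed = closedUp.splice(1, 1);
console.log(removed);  // ["b"]
console.log(closedUp); // ["a", "c", "d"]
```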