String methods

Strings include properties and methods to manage their content.

Properties

Property  Description
len       Total length in characters
hash      Unique numeric identifier

text = "DinoCode"
print text.len
print text.hash

Hash characteristics

What is it for?

DinoCode optimizes several string operations by calculating a unique numeric value (hash) that represents the string. This makes it easier and faster for the interpreter to identify and find a string in certain contexts.

Is it expensive?

Not exactly. The hash is lazy: it is not calculated when a string is created, only the first time one of the following happens:

  • The hash is accessed through the hash property
  • The string is interned using the .intern() method
  • The string is used as a key in an object

Once calculated, it’s cached so accessing it multiple times is very efficient.

# Lazy hash occurs in dynamic strings
text = "2 + 2 = ${2 + 2}"

print text.hash  # Calculated at this moment
print text.hash  # Read from cache

Random hash per session

For security, the hash is randomized per session: if we run the same program several times, the same string gets a different hash in each run.
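The lazy, cached, per-session behavior described in this section can be modeled outside DinoCode. The following is a minimal Python sketch, not DinoCode's actual implementation: the class name LazyHashString, the hash_computations counter, and the use of a keyed BLAKE2b digest as the hash function are all assumptions made for illustration.

```python
import hashlib
import secrets


class LazyHashString:
    """Toy model of a string whose hash is computed lazily, then cached."""

    def __init__(self, content: str, session_key: bytes):
        self.content = content
        self._session_key = session_key
        self._hash = None            # nothing is computed at creation time
        self.hash_computations = 0   # demo-only counter (hypothetical)

    @property
    def hash(self) -> int:
        if self._hash is None:
            # First access: compute a keyed hash and cache the result.
            digest = hashlib.blake2b(
                self.content.encode("utf-8"),
                key=self._session_key,
                digest_size=8,
            ).digest()
            self._hash = int.from_bytes(digest, "big")
            self.hash_computations += 1
        return self._hash


# One random key per simulated session (stand-in for a per-session seed).
session_a = secrets.token_bytes(16)
session_b = secrets.token_bytes(16)

text = LazyHashString("2 + 2 = 4", session_a)
print(text.hash_computations)  # 0: the hash has not been calculated yet
print(text.hash)               # calculated at this moment
print(text.hash)               # read from the cache
print(text.hash_computations)  # 1: computed only once

# Same content under a different session key: expect a different value.
print(LazyHashString("2 + 2 = 4", session_b).hash)
```

In this model the cost of hashing is paid at most once per string, and two "sessions" with different keys produce unrelated hash values for identical content.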

Methods

Method                Description
intern()              Interns the string in memory for optimization
char_at(i)            Returns the character at position i
concat(s)             Concatenates string s to the end
contains(s)           true if it contains substring s
starts_with(s)        true if it starts with s
ends_with(s)          true if it ends with s
trim()                Removes whitespace from the ends
trim(chars)           Removes specific characters from the ends
to_uppercase()        Converts to uppercase
to_lowercase()        Converts to lowercase
repeat(n)             Repeats the string n times
split()               Divides the string by spaces, returning an array
split(delim)          Divides the string using a delimiter
replace(a b)          Replaces occurrences of a with b
substr(start end)     Extracts text from index start to end
substr(start)         Extracts text from index start to the end
index_of(s)           Index of the first occurrence of s
index_of(s pos)       Searches for s starting from index pos
last_index_of(s)      Index of the last occurrence of s
last_index_of(s pos)  Searches for s backwards from pos
pad_left(w)           Pads left with spaces to length w
pad_left(w c)         Pads left using character c
pad_right(w)          Pads right with spaces to length w
pad_right(w c)        Pads right using character c

Interning characteristics - .intern()

What does it consist of?

Interning avoids keeping unnecessary duplicates of a string: interned strings with exactly the same content share a single reference in memory.

What advantages does it have?

DinoCode recognizes strings that were interned. This allows it to radically optimize comparisons between them, since instead of comparing the content, it only compares their references (same performance as comparing two numbers in Rust).

What disadvantages does it have?

Interning has an up-front computational cost. Calling .intern() calculates the hash of the string and looks it up in a centralized table. Although finding the element in the table is almost instantaneous, the complete process has a real cost that grows with the size of the string.

# Interning is not necessary in constant strings
# the compiler already took care of interning them
text1 = "Hello, my name is Ismael"
text2 = "Hello, my name is Ismael"

# They already share the same reference in memory
print id(text1) " is equal to " id(text2)

# When we create a dynamic string
#  we lose the advantage of interning because
#  it could not be interned at compile time
name = "Ismael"
text3 = "Hello, my name is $name"

print id(text3) " is different from " id(text1)

# If we use intern() the interpreter will take care
# of interning the string
text4 = "Hello, my name is $name".intern()

print id(text4) " is equal to " id(text1)

Lazy by nature

Interning is lazy by nature. You pay a small price up front, when the string is processed, in exchange for instant comparisons later.
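The mechanism described in this section (a central table that hands back one shared reference per distinct content) can be sketched in a few lines. This is a Python model under stated assumptions, not DinoCode's internals: the Str and InternPool classes are hypothetical, and a plain dictionary stands in for the centralized table.

```python
class Str:
    """Toy string object: two instances are distinct even with equal content."""

    def __init__(self, content: str):
        self.content = content


class InternPool:
    """Central table mapping each content to one canonical Str instance."""

    def __init__(self):
        self._table = {}

    def intern(self, s: Str) -> Str:
        # The dictionary hashes the content and looks it up: this is the
        # up-front cost. If the content is already present, the existing
        # instance is returned; otherwise s becomes the canonical one.
        return self._table.setdefault(s.content, s)


pool = InternPool()
name = "Ismael"

text1 = pool.intern(Str("Hello, my name is Ismael"))   # constant, interned early
text3 = Str(f"Hello, my name is {name}")               # dynamic, not interned
text4 = pool.intern(Str(f"Hello, my name is {name}"))  # dynamic, then interned

print(text3 is text1)  # False: equal content, different references
print(text4 is text1)  # True: interning returned the shared reference
```

Once two strings come from the pool, checking equality reduces to the identity test `is`, which compares references instead of walking the characters.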

Examples

Length and validation

text = "Hello World"

print text.len        # 11

if text.starts_with("Hello")
    print "Starts with Hello"
if text.ends_with("World")
    print "Ends with World"
if text.contains("World")
    print "Contains World"

Transformations

text = "  Hello World  "

print text.trim()          # "Hello World"
print text.to_uppercase()  # "  HELLO WORLD  "
print text.to_lowercase()  # "  hello world  "
print "Hi".repeat(3)        # "HiHiHi"

Search and extraction

text = "Hello World"

print text.index_of("World")  # 6
print text.substr(0 4)        # "Hello"
print text.substr(6)          # "World"
print text.replace("World" "Dino") # "Hello Dino"

Division and padding

# Division
fruits = "apple,grape".split(",")
print fruits    # ["apple" "grape"]
print fruits[1] # "grape"

# Padding
number = "42"
print number.pad_left(5 "0")   # "00042"
print number.pad_right(5 "-")  # "42---"