String methods

Strings expose properties and methods for inspecting and manipulating their content.

Properties

Property	Description
`len`	Total length in characters
`hash`	Numeric hash value that identifies the string's content

text = "DinoCode"
print text.len
print text.hash

Hash characteristics

What is it for?

DinoCode optimizes several string operations by calculating a unique numeric value (hash) that represents the string. This makes it easier and faster for the interpreter to identify and find a string in certain contexts.

Is it expensive?

Not really. The hash is lazy: it is not calculated when a string is created, only the first time one of the following happens:

- The `hash` property is accessed
- The string is interned using the `.intern()` method
- The string is used as a key in an object

Once calculated, it’s cached so accessing it multiple times is very efficient.

# Dynamic strings calculate their hash lazily
text = "2 + 2 = ${2 + 2}"

print text.hash  # Calculated at this moment
print text.hash  # Read from cache
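
The lazy-then-cached behavior can be pictured with a small model. The sketch below is written in Python purely as an illustration, not as DinoCode's implementation: `LazyHashString`, the `hash_computations` counter, and the choice of FNV-1a as the hash function are all assumptions made for the example.

```python
class LazyHashString:
    """Toy string wrapper whose hash is computed on first use, then cached."""

    def __init__(self, text):
        self.text = text
        self._hash = None            # nothing is computed at creation time
        self.hash_computations = 0   # instrumentation for this demo only

    @property
    def hash(self):
        if self._hash is None:
            # First access: walk the whole string once (64-bit FNV-1a,
            # used here only as a stand-in hash function).
            h = 0xCBF29CE484222325
            for byte in self.text.encode("utf-8"):
                h ^= byte
                h = (h * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF
            self._hash = h
            self.hash_computations += 1
        return self._hash


text = LazyHashString(f"2 + 2 = {2 + 2}")
print(text.hash_computations)   # 0: creating the string did not hash it
first = text.hash               # calculated at this moment
second = text.hash              # read from the cache
print(first == second, text.hash_computations)   # True 1
```

The first access costs time proportional to the string's length; every later access is a single field read.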

Random hash per session

For security, hashes are randomized per session: if we run the same program several times, the same string gets a different hash in each run.
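
One common way to obtain per-session hashes is to mix a random key, chosen once at startup, into every hash. The Python sketch below assumes that approach; the source does not say how DinoCode randomizes its hashes, and `Session` and the keyed BLAKE2b hash are hypothetical stand-ins.

```python
import hashlib
import os


class Session:
    """Stand-in for one interpreter run, holding its own random hash key."""

    def __init__(self, key=None):
        # A fresh 16-byte key per session. Passing a key explicitly is
        # only meant for reproducible tests.
        self.key = key if key is not None else os.urandom(16)

    def hash(self, text):
        # Keyed hash: the same text gives different values under different keys.
        digest = hashlib.blake2b(text.encode("utf-8"), key=self.key, digest_size=8)
        return int.from_bytes(digest.digest(), "little")


run_a = Session()
run_b = Session()

# Stable within one session
print(run_a.hash("DinoCode") == run_a.hash("DinoCode"))   # True

# Different sessions use different keys, so these values are expected to differ
print(run_a.hash("DinoCode"), run_b.hash("DinoCode"))
```

A practical consequence is that hash values should not be saved or compared across runs.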

Methods

Method	Description
`intern()`	Interns the string in memory for optimization
`char_at(i)`	Returns the character at position `i`
`concat(s)`	Concatenates string `s` to the end
`contains(s)`	`true` if it contains substring `s`
`starts_with(s)`	`true` if it starts with `s`
`ends_with(s)`	`true` if it ends with `s`
`trim()`	Removes whitespace from the ends
`trim(chars)`	Removes specific characters from the ends
`to_uppercase()`	Converts to uppercase
`to_lowercase()`	Converts to lowercase
`repeat(n)`	Repeats the string `n` times
`split()`	Splits the string by spaces, returning an array
`split(delim)`	Splits the string using a delimiter
`replace(a b)`	Replaces occurrences of `a` with `b`
`substr(start end)`	Extracts text from index `start` to `end`
`substr(start)`	Extracts text from index `start` to the end
`index_of(s)`	Index of the first occurrence of `s`
`index_of(s pos)`	Searches for `s` starting from index `pos`
`last_index_of(s)`	Index of the last occurrence of `s`
`last_index_of(s pos)`	Searches for `s` backwards from `pos`
`pad_left(w)`	Pads left with spaces to length `w`
`pad_left(w c)`	Pads left using character `c`
`pad_right(w)`	Pads right with spaces to length `w`
`pad_right(w c)`	Pads right using character `c`

Interning characteristics - `.intern()`

What does it consist of?

Interning avoids storing unnecessary duplicates of a string: interned strings with exactly the same content share a single reference in memory.

What advantages does it have?

DinoCode recognizes strings that were interned. This allows it to radically optimize comparisons between them, since instead of comparing the content, it only compares their references (same performance as comparing two numbers in Rust).

What disadvantages does it have?

Interning has an up-front computational cost. Calling `.intern()` calculates the hash of the string and looks it up in a centralized table. Finding the element in the table is almost instantaneous, but the complete process has a real cost that grows with the size of the string.

# Interning is not necessary for constant strings:
# the compiler has already interned them
text1 = "Hello, my name is Ismael"
text2 = "Hello, my name is Ismael"

# They already share the same reference in memory
print id(text1) " is equal to " id(text2)

# When we create a dynamic string
# we lose the advantage of interning because
# it could not be interned at compile time
name = "Ismael"
text3 = "Hello, my name is $name"

print id(text3) " is different from " id(text1)

# If we use intern() the interpreter will take care
# of interning the string
text4 = "Hello, my name is $name".intern()

print id(text4) " is equal to " id(text1)
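
The mechanism described above can be modeled in a few lines. This Python sketch is an illustration under stated assumptions, not DinoCode's implementation: `HeapString` and `InternTable` are hypothetical names, and a plain dictionary stands in for the centralized table.

```python
class HeapString:
    """Toy heap string: equal text does not imply the same object."""

    def __init__(self, text):
        self.text = text


class InternTable:
    """Central table mapping string content to one canonical object."""

    def __init__(self):
        self._table = {}

    def intern(self, s):
        # The dictionary lookup hashes s.text and probes the table. This is
        # the up-front cost, and it grows with the length of the string.
        # The first string seen with a given content becomes the canonical one.
        return self._table.setdefault(s.text, s)


table = InternTable()
name = "Ismael"

text1 = table.intern(HeapString("Hello, my name is Ismael"))
text3 = HeapString(f"Hello, my name is {name}")                 # dynamic, not interned
text4 = table.intern(HeapString(f"Hello, my name is {name}"))   # dynamic, interned

print(text3 is text1)   # False: same content, different objects
print(text4 is text1)   # True: interning returned the canonical object
```

Once both sides are interned, equality reduces to the identity check `is`, which compares two references instead of walking the characters.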

Lazy by nature

Interning is lazy by nature. You pay a small price once, when the string is first processed, in exchange for instant comparisons later.

Examples

Length and validation

text = "Hello World"

print text.len        # 11

if text.starts_with("Hello")
    print "Starts with Hello"
if text.ends_with("World")
    print "Ends with World"
if text.contains("World")
    print "Contains World"

Transformations

text = "  Hello World  "

print text.trim()          # "Hello World"
print text.to_uppercase()  # "  HELLO WORLD  "
print text.to_lowercase()  # "  hello world  "
print "Hi".repeat(3)        # "HiHiHi"

Search and extraction

text = "Hello World"

print text.index_of("World")  # 6
print text.substr(0 4)        # "Hello"
print text.substr(6)          # "World"
print text.replace("World" "Dino") # "Hello Dino"

Splitting and Padding

# Splitting
fruits = "apple,grape".split(",")
print fruits    # ["apple" "grape"]
print fruits[1] # "grape"

# Padding
number = "42"
print number.pad_left(5 "0")   # "00042"
print number.pad_right(5 "-")  # "42---"