Genie strings
Introduction
I have introduced the string datatype in my page Genie data types. Read that first. This page continues from there, with a focus on using the string functions.
The main places to look for official documentation:
http://library.gnome.org/devel/glib/stable/glib-String-Utility-Functions.html
/usr/share/vala/vapi/glib-2.0.vapi
These are the available functions:
canon, chomp, chr, chug, compress, concat, contains, down, escape, has_prefix, has_suffix, len, ndup, printf, replace, reverse, scanf, split, str, strip, substring, to_double, to_int, to_int64, to_long, to_ulong, up
Summary of string functions
| Function | Description |
| --- | --- |
| canon | |
| chomp | Removes trailing whitespace from string |
| chr | |
| chug | Removes leading whitespace from string |
| compress | |
| concat | Append one string to another |
| contains | Tests if substring is in string |
| down | Convert all letters to lower-case |
| escape | |
| has_prefix | Tests whether string begins with prefix |
| has_suffix | Tests whether string ends with suffix |
| len | Returns the length of the string |
| ndup | Duplicates the first n bytes of a string |
| printf | |
| replace | Replace a substring in string |
| reverse | Reverses string, 'abcde' becomes 'edcba' |
| scanf | |
| split | Using a delimiter, splits into array of strings |
| strip | Removes leading and trailing whitespace from string |
| substring | Obtain substring at offset in string |
| to_* | Converts string to a scalar datatype |
| up | Convert all letters to upper-case |
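As a rough sketch of a few of the functions in the table: the sample text here is made up, and the code assumes that strip() hands back the trimmed string, which is how I understand the Vala binding to behave.

```genie
[indent=4]
init
    var s = "  Puppy Linux  "
    /* keep the trimmed result, rather than relying on s being changed */
    var t = s.strip()
    print t.up()
    print t.down()
    if t.has_prefix("Puppy") do print "starts with Puppy"
    if t.has_suffix("Linux") do print "ends with Linux"
    if t.contains("py Li") do print "contains 'py Li'"
```

All three tests should succeed for this particular sample string.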
A simple example
Trying out a few of the functions...

[indent=4]
init
    var s = "The quick brown fox"
    s = s.concat(" jumped over the")
    s += " lazy dog"
    var s2 = s.ndup(s.len())
    if s == s2 do print "s and s2 have the same strings"
    print s2

Notice that you don't use a compare function to test equality; use the generic == operator. Or use != to test if not equal.
A real application
I would like to show usage of the string functions to solve a real problem. In Puppy Linux we have a file /etc/rc.d/PUPSTATE, that has entries like this:

#The partition that has the pup_save file is mounted here...
PUP_HOME='/mnt/dev_save'

This was designed to be inserted into a Bash script, which is a one-liner:

. /etc/rc.d/PUPSTATE

Bash and friends, being interpreted systems, are able to evaluate source code at runtime, which effectively is what the "." operator does.
Genie being a pure compiled language does not offer runtime evaluation of source code, so some kind of more elaborate code is going to be needed to achieve the same end.
This is long-winded, as we have to do some limited parsing of the source code in the PUPSTATE file...
[indent=4]
init
    var PUPSTATE = new dict of string,string
    var f = FileStream.open("/etc/rc.d/PUPSTATE","r")
    var a = new array of char[128]
    while f.gets(a) is not null /*read one line from file*/
        a[a.length - 1] = 0 /*make it null-terminated*/
        var s = (string)a /*cast array-of-char to a string*/
        s = s.strip() /*keep the trimmed result*/
        if s.has_prefix("#") == true do continue
        var s2 = s.split("=",2) /*returns array of string*/
        if s2.length < 2 do continue /*skip blank or malformed lines*/
        PUPSTATE[s2[0]] = s2[1] /*add to dictionary*/
    for o in PUPSTATE.keys do print("%s = %s", o, PUPSTATE[o])

The variables have been read from /etc/rc.d/PUPSTATE and are now in a dictionary named PUPSTATE. That's fine. For example, one of the lines in /etc/rc.d/PUPSTATE is "PUPMODE=12" so now we have that as a key:value entry in the dictionary, and it is easy enough to access that later in the program:

    if PUPSTATE["PUPMODE"] == "12" do print "mode is 12"

But, an interesting question. Is it possible to add to the above example so that variables are actually created? Yes...

    var PUPMODE = PUPSTATE["PUPMODE"].to_int()
    print "%d", PUPMODE
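Some PUPSTATE entries, such as the PUP_HOME line shown earlier, have their value wrapped in single quotes. Here is a minimal sketch of splitting one such line and dropping the quotes. The variable names are my own, and it assumes the value itself contains no single quotes that need to be kept.

```genie
[indent=4]
init
    var line = "PUP_HOME='/mnt/dev_save'"
    /* split at the first "=" only, giving at most two pieces */
    var parts = line.split("=", 2)
    if parts.length == 2
        var key = parts[0]
        /* remove the shell-style quotes around the value */
        var setting = parts[1].replace("'", "")
        print "%s = %s", key, setting
```

Limiting split() to two pieces means a value that itself contains "=" is not broken up any further.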
Homework exercise
Rewrite the above example so that the dictionary is not required at all. All of the variables are read from /etc/rc.d/PUPSTATE and assigned as variables in the program, with correct int and string datatypes.
Syntax note for Genie newbie
This PUPSTATE["PUPMODE"].to_int() may seem strange. The function to_int() converts a string to an integer, but you are so far familiar with seeing code like s.to_int(). However, Genie will accept anything on the left of the dot that resolves to a string. PUPSTATE["PUPMODE"] returns a string, so that's fine.
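The same idea applies to array elements and to values returned by other functions. A small sketch, with made-up sample data (I understand that newer Vala releases prefer int.parse() over to_int(), so treat the exact function name as depending on your compiler version):

```genie
[indent=4]
init
    var words = "12 34 56".split(" ")
    /* an array element is a string, so string functions can follow it */
    var n = words[1].to_int()
    print "%d", n
    /* the result of strip() is also a string, so calls can be chained */
    var padded = "  99  "
    var m = padded.strip().to_int()
    print "%d", m
```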
So far in this page we have worked with plain-vanilla C strings. However, there are some strings-on-steroids, known as GStrings...
The StringBuilder class
According to what I can glean from the Vala docs, strings are essentially immutable. That is, when created it occupies a certain amount of memory, and that's it. You can't just append to it, because it's sitting there in memory with other stuff "either side".

Oh, but you can resize a string. These two ways are equivalent:

s:string = "abc"
s = s + "xyz"
s = s.concat("xyz")

But the thing is, the old memory allocation has to be deleted and a new memory allocation made. This is very slow.
If your program has to do a lot of string resizing, especially if in a loop, then there is a special string class called StringBuilder, which does it faster.
StringBuilder is actually a frontend for the GString type in Glib. The Glib docs have this to say about a GString:
A GString is similar to a standard C string, except that it grows
automatically as text is appended or inserted. Also, it stores the
length of the string, so can be used for binary data with embedded
null bytes.
Online documentation:
http://references.valadoc.org/glib-2.0/GLib.StringBuilder.html
http://library.gnome.org/devel/glib/stable/glib-Strings.html
These are the available functions:
append, append_c, append_len, append_printf, append_unichar, assign, erase, insert, prepend, prepend_c, prepend_len, prepend_unichar, printf
Summary of StringBuilder functions
| Function | Description |
| --- | --- |
| append | Appends a string onto end of a GString |
| append_c | Appends a byte onto end of a GString |
| append_unichar | Converts a unicode char to UTF-8 and appends it |
| append_len | |
| append_printf | Appends a formatted string onto end of GString |
| assign | Copy a string into GString, overwriting original |
| erase | Erase part of the GString |
| insert | Insert another string into GString |
| prepend_* | Ditto as append_*, but at start of GString |
| printf | Inserts a formatted string into GString |

The idea here is that you can create a GString of the StringBuilder class, do all the resizing that you want, then access it as a normal C string to be able to use any of the functions further up this page ...with care though... here is an example, then I'll explain further...

[indent=4]
init
    var b = new StringBuilder
    b.append("quick fox")
    b.prepend("The ")
    b.insert(10,"brown ")
    print b.str

Easy enough, but pay special attention to that last line. The string is printed, but str is not a function, it is a field of a structure. That's because a GString is actually a structure:
GString structure
The Glib documentation (see above URL) defines a GString as a struct:

typedef struct {
  gchar *str;
  gsize len;
  gsize allocated_len;
} GString;

The variable b is an instantiation of this structure, and b.str is the address of the actual string -- furthermore, that is a normal C null-terminated string.

So, GString is just a wrapper, with a normal C string inside!
The reason I mentioned that you need to be careful is that you shouldn't use the normal string functions like concat() to resize the C string; use the StringBuilder functions for that. Function concat() returns a new string at a different location in memory, so the result is no longer the string held inside the GString structure.
It follows that you can get the length of the GString like this:
x:long = b.len
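To tie this together, here is a minimal sketch of building up a string inside a loop with StringBuilder and then reading it back as a normal string. The loop bounds and text are made up for illustration, and the cast to int is my own choice so that the length fits the %d format.

```genie
[indent=4]
init
    var b = new StringBuilder()
    for var i = 1 to 5
        /* grows the GString in place, no new string per iteration */
        b.append_printf("%d ", i)
    /* b.str is read as a normal string; strip() returns a trimmed copy */
    print b.str.strip()
    /* b.len is a field, not a function */
    print "%d", (int)b.len
```

Note that strip() is only used to read the string here; the GString itself is still changed only through the StringBuilder functions.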
The Regex class
Those of us with a Linux/Unix shell scripting background will be very familiar with utilities like grep and sed, that do string processing with regular expressions. Regular expressions are so useful, that Puppy Linux has a special regular expression help page -- click the "Help" entry in the menu.

We can also use regular expressions in Genie.

Here is a simple example...

[indent=4]
init
    var r = new Regex ("jaguar|tiger|leopard")
    var s = "wolf, tiger, eagle, jaguar, leopard, bear"
    s = r.replace(s, s.len(), 0, "pussy")
    print s

We can also just look for a match and return a boolean true/false:

[indent=4]
init
    var DB_description = "A utility to monitor serial I/O"
    var r = new Regex (" system | print | printing | process | hardware | monitor",RegexCompileFlags.CASELESS)
    if r.match(DB_description) do print "System category"

Notice a difference where the regular expression object 'r' is defined: it can take an extra parameter. These are flags, and CASELESS means the comparison will ignore the case of letters. Here are all the possible flags:
CASELESS,
MULTILINE,
DOTALL,
EXTENDED,
ANCHORED,
DOLLAR_ENDONLY,
UNGREEDY,
RAW,
NO_AUTO_CAPTURE,
OPTIMIZE,
DUPNAMES,
NEWLINE_CR,
NEWLINE_LF,
NEWLINE_CRLF
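Flags can be combined with the | operator. Here is a small sketch, with a made-up pattern and sample text, that combines CASELESS and MULTILINE. It also wraps the code in try/except, on the assumption that the Regex constructor can raise a RegexError for a bad pattern.

```genie
[indent=4]
init
    try
        /* MULTILINE lets ^ match at the start of every line */
        var flags = RegexCompileFlags.CASELESS | RegexCompileFlags.MULTILINE
        var r = new Regex("^tiger", flags)
        if r.match("wolf\nTiger\nbear") do print "found a line starting with tiger"
    except e:RegexError
        print "%s", e.message
```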
Examples
What follows are examples to show various string operations and usage of the functions.

'contains' function
I had a need to read a string from the commandline and find out if it exists within another string, that is, is it a substring:

[indent=4]
init
    var DB_nameonly = args[1]
    var PKG_CAT_Desktop = " blackbox compiz desk_icon_theme_browndust desk_icon_theme_darkfire desk_icon_theme_original e16 fbpanel fluxbox fvwm gfontsel glipper gtk-chtheme gtk_theme_citrus_cut gtk_theme_fishing_the_sky gtk_theme_fishpie gtk_theme_gradient_brown gtk_theme_gradient_grey gtk_theme_m8darker gtk_theme_phacile_blue gtk_theme_polished_blue gtk_theme_stardust_zigbert gxset icewm jwm2 jwmconfig2 lxpanel metacity minixcal obconf openbox pupx rox_filer rox_filer twm wallpaper windowmaker xclipboard xclock xkbconfigmanager xlock_gui xlockmore "
    var noPATTERN = " "+DB_nameonly+" "
    if PKG_CAT_Desktop.contains(noPATTERN) do print "true"
'replace' function
This pretty much says it all:

[indent=4]
init
    var PKG_CAT_Desktop = " blackbox compiz desk_icon_theme_browndust desk_icon_theme_darkfire desk_icon_theme_original e16 fbpanel fluxbox fvwm gfontsel glipper gtk-chtheme gtk_theme_citrus_cut gtk_theme_fishing_the_sky gtk_theme_fishpie gtk_theme_gradient_brown gtk_theme_gradient_grey gtk_theme_m8darker gtk_theme_phacile_blue gtk_theme_polished_blue gtk_theme_stardust_zigbert gxset icewm jwm2 jwmconfig2 lxpanel metacity minixcal obconf openbox pupx rox_filer rox_filer twm wallpaper windowmaker xclipboard xclock xkbconfigmanager xlock_gui xlockmore "
    var new_s = PKG_CAT_Desktop.replace(" compiz ", " yabbidoo ")
    print "%s", new_s
'substring' function
This returns a substring at a certain offset in a string. In this example, the offset is 56 and the length of the substring is 23. The offset starts from zero, so actually the returned substring is from the 57th character:

[indent=4]
init
    var PKG_CAT_Desktop = " blackbox compiz desk_icon_theme_browndust desk_icon_theme_darkfire desk_icon_theme_original e16 fbpanel fluxbox fvwm gfontsel glipper gtk-chtheme gtk_theme_citrus_cut gtk_theme_fishing_the_sky gtk_theme_fishpie gtk_theme_gradient_brown gtk_theme_gradient_grey gtk_theme_m8darker gtk_theme_phacile_blue gtk_theme_polished_blue gtk_theme_stardust_zigbert gxset icewm jwm2 jwmconfig2 lxpanel metacity minixcal obconf openbox pupx rox_filer rox_filer twm wallpaper windowmaker xclipboard xclock xkbconfigmanager xlock_gui xlockmore "
    var new_s = PKG_CAT_Desktop.substring(56,23)
    print "%s", new_s
(c) Copyright 2008,2009 Barry Kauler puppylinux.com, all reproduction rights reserved.


