Debian to Puppy package db conversion in Nim
I want to do some coding involving the Debian package database files. Here is one of them:
http://http.us.debian.org/debian/dists/bookworm/main/binary-amd64/Packages.xz
The pups, including EasyOS, have a utility, 'debdb2pupdb', that
converts this database into the Puppy-standard database format. It
runs every time you update the database in the PPM (package
manager).
In a pup, you can find 'debdb2pupdb' in /usr/local/petget.
'debdb2pupdb' is written in the BaCon language. Dima (dimkr in the forum) rewrote it in C in woof-CE, but the legacy branch of woof-CE still has the BaCon version:
https://github.com/puppylinux-woof-CE/woof-CE/tree/legacy/woof-code/support
woofQ (used to build EasyOS) has a later version of 'debdb2pupdb.bac', last edited in 2017.
Anyway, I intend to code in the Nim language in the future, so I have made tentative steps to learn Nim coding. Here are some of those steps, looking at how parts of 'debdb2pupdb.bac' can be coded in Nim.
Here is one entry from 'Packages.xz':
Package: abiword
Version: 3.0.5~dfsg-1
Installed-Size: 5133
Maintainer: Jonas Smedegaard <dr@jones.dk>
Architecture: amd64
Depends: abiword-common (>= 3.0.5~dfsg-1), gsfonts, libabiword-3.0 (>= 3.0.5~dfsg), libc6 (>= 2.14), libdbus-1-3 (>= 1.9.14), libdbus-glib-1-2 (>= 0.78), libgcc-s1 (>= 3.0), libgcrypt20 (>= 1.8.0), libglib2.0-0 (>= 2.16.0), libgnutls30 (>= 3.7.0), libgoffice-0.10-10 (>= 0.10.2), libgsf-1-114 (>= 1.14.9), libgtk-3-0 (>= 3.0.0), libjpeg62-turbo (>= 1.3.1), libloudmouth1-0 (>= 1.3.3), libots0 (>= 0.5.0), libpng16-16 (>= 1.6.2-1), librdf0 (>= 1.0.17), libreadline8 (>= 6.0), librevenge-0.0-0, libsoup2.4-1 (>= 2.4.0), libstdc++6 (>= 7), libtelepathy-glib0 (>= 0.13.0), libtidy5deb1 (>= 1:5.2.0), libwmf0.2-7 (>= 0.2.8.4), libwpd-0.10-10, libwpg-0.3-3, libxml2 (>= 2.7.4), zlib1g (>= 1:1.1.4)
Recommends: abiword-plugin-grammar, aspell-en | aspell-dictionary, fonts-liberation, poppler-utils
Description: efficient, featureful word processor with collaboration
Homepage: http://www.abisource.com/
Description-md5: 30063e6f0ad54b0bc4811f0becf40355
Tag: implemented-in::c++, interface::graphical, interface::x11,
role::program, scope::application, uitoolkit::gtk, use::editing,
use::text-formatting, works-with-format::html, works-with-format::tex,
works-with::text, x11::application
Section: editors
Priority: optional
Filename: pool/main/a/abiword/abiword_3.0.5~dfsg-1_amd64.deb
Size: 1334712
MD5sum: 918644c61e57f56a99dfe1fbd9cefc5d
SHA256: 5cad479d7a59c611a64204bbfef736daafcd5ebd9bc0e5713b48c340c945519f
Version field
Consider the "Version:" field. 'debdb2pupdb.bac' simplifies it by cutting off the "~dfsg-1". However, I examined the code, and that simplification doesn't catch all possibilities. Here are some Version values from the Debian database file:
0+git20220815+ds-1 0.git20161021-3 0~git20201010.1.fe3a737-2
0.0~svn10-0.1+b2 0+svn9904-5 0.0svn20121225-3
...hmmm, if I chop off the svn and git parts, I am left with just 0
Some more:
1:2.1.0+debian-7 2.3.1-debian1-4+b1 1.12.0.1+debian+dfsg3-4
0.10.0+git20210628-3 1.4+svn142-12 0.11.1+really0.6.0-1
There could also be numbers before the text, e.g. 1.2.3+4.5debian, where I just want 1.2.3
I want the version number in the Puppy-format database to be as simple as possible, so I am thinking of being radical: removing all text beyond the first version number.
Here is some test code in Nim. There are a few regular expression libraries to choose from, I chose "re":
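A minimal sketch of that "radical" rule, assuming it means "drop any epoch prefix such as 1:, then keep only the leading run of digits and dots". The proc name simplifyVersion is hypothetical, not taken from 'debdb2pupdb.bac':

```nim
import std/re

# Hypothetical helper: reduce a Debian version string to its first
# plain version number.
proc simplifyVersion(deb: string): string =
  # Assumption: an epoch is leading digits followed by ':' (e.g. "1:").
  let noEpoch = deb.replace(re"^[0-9]+:", "")
  # matchLen is anchored at the start of the string and returns the
  # length of the match, or -1 when nothing matches.
  let n = noEpoch.matchLen(re"[0-9]+(?:\.[0-9]+)*")
  if n > 0: noEpoch[0 ..< n] else: noEpoch

when isMainModule:
  for v in ["0.0svn20121225-3", "1:2.1.0+debian-7", "0.11.1+really0.6.0-1"]:
    echo v, " becomes: ", simplifyVersion(v)
```

With this rule, values such as 0~git20201010.1.fe3a737-2 and 0.git20161021-3 both collapse to just 0, which is the trade-off noted above.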
Notice the "re" prefix to the string. That means the string is compiled as a regular expression. It is also treated as a "raw string", meaning that Nim will not interpret characters that would otherwise have special meaning in a string literal, such as "\". An alternative is "rex", which compiles an extended regular expression, in which whitespace and "#" comments inside the pattern are ignored.
I am playing with optimizations when compiling:
# nim c --mm:arc -d:useMalloc --passC:-flto -d:release --opt:size re1.nim
After stripping, the binary is 27KB. Here is the result of executing it:
# ./re1
0.0svn20121225-3 becomes: 0.0
1:2.1.0+debian-7 becomes: 2.1.0
0.11.1+really0.6.0-1 becomes: 0.11.1
Depends field
Moving on, the "Depends:" entry gets heavily processed by 'debdb2pupdb'. Here it is in Debian format, and again after conversion to Puppy format by 'debdb2pupdb':
#Depends: abiword-common (>= 3.0.5~dfsg-1), gsfonts, libabiword-3.0 (>= 3.0.5~dfsg), libc6 (>= 2.14), libdbus-1-3 (>= 1.9.14), libdbus-glib-1-2 (>= 0.78), libgcc-s1 (>= 3.0), libgcrypt20 (>= 1.8.0), libglib2.0-0 (>= 2.16.0), libgnutls30 (>= 3.7.0), libgoffice-0.10-10 (>= 0.10.2), libgsf-1-114 (>= 1.14.9), libgtk-3-0 (>= 3.0.0), libjpeg62-turbo (>= 1.3.1), libloudmouth1-0 (>= 1.3.3), libots0 (>= 0.5.0), libpng16-16 (>= 1.6.2-1), librdf0 (>= 1.0.17), libreadline8 (>= 6.0), librevenge-0.0-0, libsoup2.4-1 (>= 2.4.0), libstdc++6 (>= 7), libtelepathy-glib0 (>= 0.13.0), libtidy5deb1 (>= 1:5.2.0), libwmf0.2-7 (>= 0.2.8.4), libwpd-0.10-10, libwpg-0.3-3, libxml2 (>= 2.7.4), zlib1g (>= 1:1.1.4)
#+abiword-common&ge3.0.5,+gsfonts,+libabiword-3.0&ge3.0.5,+libc6&ge2.14,+libdbus-1-3&ge1.9.14,+libdbus-glib-1-2&ge0.78,+libgcc-s1&ge3.0,+libgcrypt20&ge1.8.0,+libglib2.0-0&ge2.16.0,+libgnutls30&ge3.7.0,+libgoffice-0.10-10&ge0.10.2,+libgsf-1-114&ge1.14.9,+libgtk-3-0&ge3.0.0,+libjpeg62-turbo&ge1.3.1,+libloudmouth1-0&ge1.3.3,+libots0&ge0.5.0,+libpng16-16&ge1.6.2-1,+librdf0&ge1.0.17,+libreadline8&ge6.0,+librevenge-0.0-0,+libsoup2.4-1&ge2.4.0,+libstdc++6&ge7,+libtelepathy-glib0&ge0.13.0,+libtidy5deb1&ge5.2.0,+libwmf0.2-7&ge0.2.8.4,+libwpd-0.10-10,+libwpg-0.3-3,+libxml2&ge2.7.4,+zlib1g&ge1.1.4
The code that does that in 'debdb2pupdb.bac' is quite convoluted. Having a go with Nim:
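A minimal sketch of one way to do it, assuming only the two forms that appear in the abiword entry: a bare package name, and a name with a ">=" constraint. The epoch and anything from "~" onward are dropped from the version, which matches the converted line above. The proc name convertDepends is hypothetical, and other operators or "a | b" alternatives are not handled properly here:

```nim
import std/[re, strutils]

# Hypothetical helper: convert a Debian "Depends:" value into the
# Puppy-style "+name&geVERSION" list.
proc convertDepends(debDepends: string): string =
  var parts: seq[string]
  for raw in debDepends.split(','):
    let item = raw.strip()
    if item.len == 0: continue
    # Form: name (>= [epoch:]version[~suffix])
    # matches[0] is the name, matches[1] the version without epoch/suffix.
    if item =~ re"^(\S+) \(>= (?:[0-9]+:)?([^~)]+)[^)]*\)$":
      parts.add("+" & matches[0] & "&ge" & matches[1])
    else:
      # Anything else: keep only the first word (the package name).
      parts.add("+" & item.split(' ')[0])
  result = parts.join(",")

when isMainModule:
  echo convertDepends("abiword-common (>= 3.0.5~dfsg-1), gsfonts, " &
                      "libpng16-16 (>= 1.6.2-1), zlib1g (>= 1:1.1.4)")
```

The `=~` template from 're' fills an implicit `matches` array with the capture groups, so no separate variable has to be declared for them.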
Well, that was quite simple and logical. Running it:
# ./re2
+abiword-common&ge3.0.5,+gsfonts,+libabiword-3.0&ge3.0.5,+libc6&ge2.14,+libdbus-1-3&ge1.9.14,+libdbus-glib-1-2&ge0.78,+libgcc-s1&ge3.0,+libgcrypt20&ge1.8.0,+libglib2.0-0&ge2.16.0,+libgnutls30&ge3.7.0,+libgoffice-0.10-10&ge0.10.2,+libgsf-1-114&ge1.14.9,+libgtk-3-0&ge3.0.0,+libjpeg62-turbo&ge1.3.1,+libloudmouth1-0&ge1.3.3,+libots0&ge0.5.0,+libpng16-16&ge1.6.2-1,+librdf0&ge1.0.17,+libreadline8&ge6.0,+librevenge-0.0-0,+libsoup2.4-1&ge2.4.0,+libstdc++6&ge7,+libtelepathy-glib0&ge0.13.0,+libtidy5deb1&ge5.2.0,+libwmf0.2-7&ge0.2.8.4,+libwpd-0.10-10,+libwpg-0.3-3,+libxml2&ge2.7.4,+zlib1g&ge1.1.4
Yes! To keep learning Nim, I might convert the entire 'debdb2pupdb.bac'. Here are links to the 're' and 'strutils' library modules:
https://nim-lang.org/docs/re.html
https://nim-lang.org/docs/strutils.html