Plan 9 from Bell Labs’s /usr/web/sources/extra/i/Design

Copyright © 2021 Plan 9 Foundation.
Distributed under the MIT License.


DESIGN OF "i"
	By Howard Trickey (howard@research.bell-labs.com)

Most of the data structures used in "i" are declared
in i.h.  That file has a number of sections, headed
by comments (// URL, //STRINTTAB, //IUTILS, etc.)
giving the name of the c file that implements the
corresponding routines.

The c files and their purpose are:

assert.c - for aborting when "this can't happen"
build.c - parses HTML and converts to "build" format
event.c - construction and debug print for "events"
file.c - fetch code specific to FILE: protocol
ftp.c - fetch code specific to FTP: protocol
gc.c - tables and code to allow walking of data structures
	by a general "live data" collector
gui.c - code to deal with window system: finding
	and maintaining the windows that are drawn in,
	reading the keyboard and mouse and delivering
	those results as events to the main program
http.c - fetch code specific to HTTP: protocol
i.c - main program: main event loop; code for placing
	and dealing with the main user interaction elements;
	code for initiating a "goto" a new url (by calling
	layout()); code to maintain history
icons.c - arrays containing user-interface icons
img.c - converts jpg and gif files to draw images,
	including dithering (if needed).
iutils.c - many routines needed by more than one
	of the other modules; image cache code;
	user options (configuration) code; common
	code to deal with getting entities via the various
	protocols (only a little bit of custom code needs
	to live in the separate files file.c, ftp.c, http.c).
layout.c - lays out one page, using iutils to fetch
	"sources" (html, images), and then using build
	and/or img routines to get into internal form,
	then laying out in available space.  Also has
	code for drawing each of the user interface
	"controls" (entry boxes, combo boxes, etc.).
lex.c - lexical analysis of html, used by build.
strinttab.c - string-to-int lookup routine
transport.c - tables giving pointers to implementations
	of various parts of the entity fetching logic
	needed for each of the transport mechanisms
	(file, ftp, http).
url.c - implements a "url" datatype, with various
	utilities to operate on them
utils.c - memory, list, and string utilities
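
As a rough illustration of the kind of table that
transport.c is described as holding, here is a minimal
sketch in portable C.  The Transport structure, its
fields, the describe() hooks and findtransport() are
all hypothetical names made up for this example; the
real declarations are in i.h and may look quite
different.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-transport entry; not the real i.h layout. */
typedef struct Transport Transport;
struct Transport {
	const char *scheme;             /* "file", "ftp" or "http" */
	int defport;                    /* default port, 0 if none */
	const char *(*describe)(void);  /* stand-in for a fetch step */
};

static const char *filedesc(void) { return "read local file"; }
static const char *ftpdesc(void) { return "ftp RETR"; }
static const char *httpdesc(void) { return "http GET"; }

/* One row per transport mechanism. */
static Transport transports[] = {
	{ "file", 0, filedesc },
	{ "ftp", 21, ftpdesc },
	{ "http", 80, httpdesc },
};

/* Return the entry for scheme, or NULL if it is not supported. */
Transport *
findtransport(const char *scheme)
{
	size_t i;

	for(i = 0; i < sizeof transports / sizeof transports[0]; i++)
		if(strcmp(transports[i].scheme, scheme) == 0)
			return &transports[i];
	return NULL;
}
```

With a table like this, common fetch code can look up
the scheme of a url once and then call through the
function pointers without knowing which protocol is
in use.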

STARTUP

When "i" starts up, it first calls meminit, to set things
up so that emalloc() and emallocz() will allocate out
of a big "main group" pool.  This is the pool that should
hold long-lasting data.  In general, emalloc() and emallocz()
allocate out of a pool specifically dedicated to the
current process.
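
For readers unfamiliar with the emalloc() idiom, a
minimal sketch follows.  It does not model the pools
at all: plain malloc() stands in for the pool
allocator, and aborting on failure is an assumption
about the convention, not something taken from the
"i" source.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sketch only: allocate n bytes or abort.
 * malloc() stands in for the real pool allocator.
 */
void *
emalloc(size_t n)
{
	void *p;

	if(n == 0)
		n = 1;	/* malloc(0) may legitimately return NULL */
	p = malloc(n);
	if(p == NULL){
		fprintf(stderr, "out of memory\n");
		abort();
	}
	return p;
}

/* Like emalloc(), but the returned memory is zeroed. */
void *
emallocz(size_t n)
{
	void *p;

	p = emalloc(n);
	memset(p, 0, n);
	return p;
}
```

The point of the idiom is that callers never need to
check the result for a null pointer.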

Next, "i" calls initdraw() to initialize Plan 9's graphics.

Then it calls iutilsinit(), which initializes all the other
"modules" (each, by convention, has an xxxinit()
function), and processes the command line and
the user's configuration file (/usr/username/lib/iconfig)
for the startup url and various user-configurable
options.

Next, "i" calls guiinit(), which finds out the containing
rio window, divides up that space into the three
regions "control" (top bar, with buttons, etc), "main"
(where the page is drawn), and "prog" (where the
progress bar goes).  It also starts up two processes
to read the mouse and keyboard, and pass on mouse
and keyboard "events" along an evchan channel
(those events are processed in the main event loop,
see below).

Next, "i" creates a "netget" thread (see netget in
iutils.c).  Netget acts as a centralized control for
fetching all of the entities needed to display
a page.  It communicates with the rest of "i" via
channels.  Netget runs forever.

Next, "i" creates a "go" thread, whose purpose is
to sit around waiting for commands to "go" somewhere,
and act on them by calling layout.  The "go" thread
runs forever.

Finally, "i" enters a main loop, where it continually
reads the event channel and acts on those events.
Many of the events can result in a new place
to "go", after acting on them (e.g., a mouse event
might be a click over a link).  So, after each event,
we check to see if there is a new place to go,
and if so, we abort the current "go", and then send
a new one along the go channel, to be acted upon by
the "go" thread.
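
The shape of that loop can be sketched as follows.
This is a simplified, single-threaded model: an array
of events stands in for the event channel, and
assigning to g->current stands in for sending along
the go channel.  The Event and GoState structures,
the event kinds, and handle() are hypothetical.

```c
#include <stddef.h>

enum { Emouse, Ekey, Equit };	/* hypothetical event kinds */

typedef struct Event Event;
struct Event {
	int kind;
	const char *link;	/* url under a mouse click, or NULL */
};

typedef struct GoState GoState;
struct GoState {
	const char *current;	/* destination the "go" thread has */
	int naborted;		/* number of gos that were aborted */
};

/* Act on one event; return a new destination, or NULL if none. */
static const char *
handle(Event *e)
{
	if(e->kind == Emouse && e->link != NULL)
		return e->link;
	return NULL;
}

/* Process events until a quit event; return how many were handled. */
int
mainloop(Event *evs, int n, GoState *g)
{
	int i;
	const char *dest;

	for(i = 0; i < n; i++){
		if(evs[i].kind == Equit)
			break;
		dest = handle(&evs[i]);
		if(dest != NULL){
			if(g->current != NULL)
				g->naborted++;	/* abort the current go */
			g->current = dest;	/* "send" the new go */
		}
	}
	return i;
}
```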

"i" exits when a quit event is received.


TYPICAL "GO" PROCESSING

Drawing is done inside a "Frame" (see i.h).
The whole main drawing window is a Frame,
and if the HTML document is a frameset,
then the main Frame will contain kid Frames.
"i"'s first job is to find out which Frame a
given "go specification" refers to, and then
it calls get() to actually fetch and render
the contents of that Frame.  get() will be
called recursively if the contents are a frameset.
get() also deals with redirections, authorization
challenges, and notifying the user of fetch
errors if the main entity for a page cannot be
fetched.

Fetching of entities (html, images, eventually
things like scripts, style sheets) is organized
around the concept of a ByteSource (see i.h).
This is a structure with one big buffer that will
eventually contain all of the bytes of the
entity, and two indices: edata is set by the
producer (transport mechanism) to say how
much of the buffer is valid; lim is set by the
consumer (lexer or image converter) to
say how much has been consumed so far.
This design (of one big buffer) was the result
of several iterations, and seems better than
a linked-list or partial-buffer approach.
The ByteSource also has a Header structure
that contains information that the transport
mechanism knows from its initial handshake
with the other end.
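
To make the two indices concrete, here is a minimal
sketch of such a buffer.  Only the names edata and lim
come from the description above; the data and dsize
fields, the growth-by-realloc policy, and the
bsappend() and bsconsume() helpers are assumptions
made for illustration (the Header is left out).

```c
#include <stdlib.h>
#include <string.h>

typedef struct ByteSource ByteSource;
struct ByteSource {
	unsigned char *data;	/* one big buffer for the whole entity */
	int dsize;		/* allocated size of data (hypothetical) */
	int edata;		/* producer: bytes 0..edata-1 are valid */
	int lim;		/* consumer: bytes 0..lim-1 are consumed */
};

/* Producer side: append n bytes.  Returns 0, or -1 if out of memory. */
int
bsappend(ByteSource *bs, const void *buf, int n)
{
	unsigned char *p;
	int nsize;

	if(n <= 0)
		return 0;
	if(bs->edata + n > bs->dsize){
		nsize = 2 * (bs->edata + n);
		p = realloc(bs->data, nsize);
		if(p == NULL)
			return -1;
		bs->data = p;
		bs->dsize = nsize;
	}
	memcpy(bs->data + bs->edata, buf, n);
	bs->edata += n;
	return 0;
}

/* Consumer side: copy out up to max unconsumed bytes; return the count. */
int
bsconsume(ByteSource *bs, unsigned char *out, int max)
{
	int n;

	n = bs->edata - bs->lim;
	if(n > max)
		n = max;
	if(n > 0){
		memcpy(out, bs->data + bs->lim, n);
		bs->lim += n;
	}
	return n > 0 ? n : 0;
}
```

Because nothing is ever discarded from the front of
the buffer, the consumer can always go back and look
at earlier bytes, which is one thing a linked list of
partial buffers makes awkward.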

The get() routine starts its fetching process
by packing up (as a ReqInfo) all of the data
needed to fetch the entity, and calling
startreq() with that as argument.  The result
is a ByteSource, and if its err field is not set,
the netget thread is busy trying to fill it.
Next, it calls waitreq(), which waits for notification
of change in any active ByteSource.  As there
is only one, it should be the one just returned
by startreq().  Either an error has occurred
or the Header is now filled, and get() can
decide whether to handle a challenge or redirection,
show an error, or, if all is OK, start layout.
If all is OK, it calls layout() with the Frame
and ByteSource as arguments.
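
The decision get() makes once the Header is filled
can be sketched roughly as below.  The action names,
the code field of Header, and nextaction() are
hypothetical; the status values are the standard
HTTP ones, and the real get() has to cope with the
other transports too.

```c
#include <stddef.h>

enum { Alayout, Aredirect, Achallenge, Aerror };  /* hypothetical */

typedef struct Header Header;
struct Header {
	int code;		/* status from the handshake (hypothetical) */
};

typedef struct ByteSource ByteSource;
struct ByteSource {
	const char *err;	/* set if the fetch failed */
	Header hdr;
};

/* Decide what to do after waitreq() reports a change. */
int
nextaction(ByteSource *bs)
{
	if(bs->err != NULL)
		return Aerror;
	switch(bs->hdr.code){
	case 200:
		return Alayout;		/* all is OK: start layout */
	case 301:
	case 302:
		return Aredirect;	/* fetch the new location instead */
	case 401:
		return Achallenge;	/* authorization needed */
	default:
		return Aerror;
	}
}
```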

layout() is organized around processing
a list of "Source"s.  The list is initialized
with the passed-in ByteSource, and as
processing continues, things (images)
get added to it.  Each source has a type
and a ByteSource. 

The main loop of layout goes until all
the sources are "done" (either processed
completely, or some error happened;
in either case, the ByteSource resources
are cleaned up).  The loop waits for
some ByteSource to have a state change,
and acts on that change.  For instance,
if an HTML source gets more data,
we call getitems() (in build.c) to convert
whatever it can into the internal form
used by the layout engine.  If an image
source gets more data, the image converter
is called (currently the converter only does
something if all of the data is there, but
the structure is there for incremental
handling).
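
A much simplified model of that loop is sketched
below.  There are no threads here: "more data
arrived" is simulated by handing each unfinished
source a fixed-size chunk per pass, and a counter
stands in for the calls to getitems() or the image
converter.  The Source fields, the state names and
layoutloop() are hypothetical, and the sketch does
not model images being added to the list as the
HTML is parsed.

```c
#include <stddef.h>

enum { Shtml, Simage };			/* source types */
enum { Swaiting, Sdone, Serror };	/* hypothetical states */

typedef struct Source Source;
struct Source {
	int type;
	int state;
	int avail;	/* bytes delivered so far */
	int total;	/* size of the whole entity */
	int nconv;	/* times the converter was called */
	Source *next;
};

/* One state change: n more bytes have arrived for s. */
static void
moredata(Source *s, int n)
{
	s->avail += n;
	if(s->avail > s->total)
		s->avail = s->total;
	s->nconv++;	/* stands in for getitems() or the image converter */
	if(s->avail == s->total)
		s->state = Sdone;
}

static int
alldone(Source *list)
{
	Source *s;

	for(s = list; s != NULL; s = s->next)
		if(s->state == Swaiting)
			return 0;
	return 1;
}

/* Run until every source is done; return the number of passes. */
int
layoutloop(Source *list, int chunk)
{
	Source *s;
	int passes;

	if(chunk <= 0)
		return -1;	/* would never make progress */
	passes = 0;
	while(!alldone(list)){
		for(s = list; s != NULL; s = s->next)
			if(s->state == Swaiting)
				moredata(s, chunk);
		passes++;
	}
	return passes;
}
```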

The internal form of an HTML document
is a list of Items (see i.h).  An Item is
actually a kind of variant data type,
simulated in C.  There are a number of
common data fields (defined in the Item
structure itself), but then there is a tag,
and depending on the value of the tag,
the Item can be cast to one of the
particular Item types (Itext, etc.) to
get the remaining fields.  This structure
came from Charon, where space was
at a premium.  I think it is still good,
to save on the number of memory allocations
needed, but perhaps it is a little less necessary
in the Plan 9 world.
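
The variant idiom looks roughly like this in
portable C.  This is a sketch, not the i.h
declarations: the tag names, the Irule type, the
particular common fields, and the helper functions
are all made up for the example.  Only the names
Item and Itext come from the text above.

```c
#include <stdlib.h>

enum { Itexttag, Iruletag };	/* hypothetical tag values */

/* Fields common to every kind of item. */
typedef struct Item Item;
struct Item {
	Item *next;
	int width, height, ascent;
	int state;
	int tag;	/* says which particular type this really is */
};

/* The common part comes first, so an Item* can be cast to Itext*. */
typedef struct Itext Itext;
struct Itext {
	Item item;
	const char *s;
};

typedef struct Irule Irule;
struct Irule {
	Item item;
	int size;
};

/* One allocation holds both the common and the particular fields. */
Item *
newitext(const char *s)
{
	Itext *t;

	t = calloc(1, sizeof *t);
	if(t == NULL)
		return NULL;
	t->item.tag = Itexttag;
	t->s = s;
	return &t->item;
}

Item *
newirule(int size)
{
	Irule *r;

	r = calloc(1, sizeof *r);
	if(r == NULL)
		return NULL;
	r->item.tag = Iruletag;
	r->size = size;
	return &r->item;
}

/* Return the string of a text item, or NULL for any other kind. */
const char *
itemtext(Item *it)
{
	if(it->tag != Itexttag)
		return NULL;
	return ((Itext*)it)->s;
}
```

The cast is only safe after checking the tag, which
is why code that walks an Item list typically
switches on the tag first.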

The items have generic "state" bits that come
from the layout needs dictated by the HTML
spec and conventions.  For instance, if
IFbrk is set, there is a forced line break
before the item.

The appenditems() routine in layout.c is
the main routine for taking the Item list
from build() and using it to place items
where they belong on the page.
The representation of how the page
should look is kept in a "Lay" structure
(see i.h), which organizes the items into
a doubly-linked list of "Line" structures.
Each line has a list of Items (the Items
from parsing are distributed into lines).
Before adding items to lines, the measure()
routine does as good a job as it can
at measuring the height, width and
ascent (amount above baseline) of the
items.
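
The relationship between Lay, Line and Item can be
sketched as follows.  The field names, the value
given to IFbrk, and the distribute() routine are
assumptions for illustration; the real appenditems()
does far more (measuring, floats, tables).  Here the
only rule is that an item with IFbrk set starts a
new line.

```c
#include <stdlib.h>

enum { IFbrk = 1 };	/* the bit value is an assumption */

typedef struct Item Item;
struct Item {
	Item *next;
	int width, height, ascent;
	int state;
};

typedef struct Line Line;
struct Line {
	Line *prev, *next;	/* doubly-linked list of lines */
	Item *items, *last;	/* items placed on this line */
	int width, height;
};

typedef struct Lay Lay;
struct Lay {
	Line *first, *last;
	int nlines;
};

/* Add an empty line at the end of lay. */
static Line *
newline(Lay *lay)
{
	Line *l;

	l = calloc(1, sizeof *l);
	if(l == NULL)
		return NULL;
	l->prev = lay->last;
	if(lay->last != NULL)
		lay->last->next = l;
	else
		lay->first = l;
	lay->last = l;
	lay->nlines++;
	return l;
}

/* Move items onto lines of lay.  Returns 0, or -1 if out of memory. */
int
distribute(Lay *lay, Item *items)
{
	Item *it, *nxt;
	Line *l;

	l = lay->last;
	for(it = items; it != NULL; it = nxt){
		nxt = it->next;
		it->next = NULL;
		if(l == NULL || (it->state & IFbrk)){
			l = newline(lay);
			if(l == NULL)
				return -1;
		}
		if(l->last != NULL)
			l->last->next = it;
		else
			l->items = it;
		l->last = it;
		l->width += it->width;
		if(it->height > l->height)
			l->height = it->height;
	}
	return 0;
}
```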

The heart of the layout algorithm is the
fixlinegeom() routine.  Its main job
is to break long lines, and, a serious
complication, deal with images and
tables that are supposed to "float"
on the left or right margins.  The
floats are kept in a list, and we have
to determine when it is time to put
a float on the margin, and how that
affects the current line width for
breaking purposes.
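
To show how a float changes the width available for
breaking, here is a deliberately small sketch.  It is
not fixlinegeom(): it handles a single float that is
already on the left margin at the top of the page,
every line has the same height, and items are reduced
to bare widths.  The function name and all of its
parameters are hypothetical.

```c
/*
 * Greedy line breaking beside one float.
 *	w[0..n-1]	item widths
 *	pagew		full line width
 *	lineh		height of every line
 *	fw, fh		width and height of the float
 * A line whose top is above the bottom of the float (y < fh)
 * is narrowed by fw.  linestart[i] gets the index of the first
 * item on line i.  Returns the number of lines made; if that
 * reaches maxlines, later items are left unplaced.
 */
int
breaklines(const int *w, int n, int pagew, int lineh,
	int fw, int fh, int *linestart, int maxlines)
{
	int i, y, nl, avail, used;

	i = 0;
	y = 0;
	nl = 0;
	while(i < n && nl < maxlines){
		avail = pagew;
		if(y < fh)
			avail -= fw;
		linestart[nl++] = i;
		/* always take one item, so an over-wide item cannot stall us */
		used = w[i++];
		while(i < n && used + w[i] <= avail)
			used += w[i++];
		y += lineh;
	}
	return nl;
}
```

For example, six items of width 40 on a page of width
100, with lines of height 10 and a float 50 wide and
15 high, come out as four lines: the first two lines
sit beside the float and hold one item each, and the
last two use the full width and hold two each.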

The other complicated part of layout
has to do with table layout.  The
sizetable() routine does much of the work.
It needs to make sublayouts of its own,
and thus calls additems inside
another Lay structure.

The final part of layout is to actually
draw the lines.  The drawall() routine
in layout.c does this.  It is pretty
straightforward, just keeping track
of the current position and walking
the lines, using the calculated dimensions
to update the current position and
drawing the items using code specific
to each type.  The most complicated
things to draw are tables (which
use the same code recursively)
and form fields, which are drawn
by drawctl() --- it is mainly complicated
because it draws all of the borders, buttons,
comboboxes, etc., using primitive
draw operations.  We specifically avoid
heavyweight "embedded windows".
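
The position-tracking part of that walk can be
sketched as below.  Nothing is drawn; the routine
only computes where each item's top-left corner would
go, lining items up on the baseline of their line.
The structures are cut down to the fields needed, and
placeall() and the x and y fields are hypothetical
names.

```c
#include <stddef.h>

typedef struct Item Item;
struct Item {
	Item *next;
	int width, ascent;
	int x, y;	/* computed top-left corner (hypothetical) */
};

typedef struct Line Line;
struct Line {
	Line *next;
	Item *items;
	int height, ascent;
};

/*
 * Walk the lines starting at (x0, y0), recording a position
 * for every item.  Returns the total height of all lines.
 */
int
placeall(Line *lines, int x0, int y0)
{
	Line *l;
	Item *it;
	int x, y;

	y = y0;
	for(l = lines; l != NULL; l = l->next){
		x = x0;
		for(it = l->items; it != NULL; it = it->next){
			it->x = x;
			/* drop shorter items so all baselines agree */
			it->y = y + l->ascent - it->ascent;
			x += it->width;
		}
		y += l->height;
	}
	return y - y0;
}
```

A real drawing routine would call type-specific code
at the point where this sketch stores x and y.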
