Global variables

Latest revision as of 18:41, 6 July 2012

This article is about planned development or proposed functionality. Comments welcome.

We would like to eliminate the current use of global variables in ASCEND, so that we can start to think about using ASCEND in multithreaded and/or embedded ways. This page reports the cases of global variables that we have found, along with some discussion of how best to eliminate them. Not all cases will be the same. Globals make some aspects of reading code easier (C argument lists are not crowded with context pointers) and others harder, particularly when complex side effects are in play. We try hard to avoid any use of globals with complex side effects.

Ways for removing global variables

A number of options exist:

  • keeping them. This is appropriate in a very limited set of situations, such as for data that has been loaded from a configuration file when the program started.
  • grouping them into a top-level data structure. We imagine structures like "library", "simulation" and "system" could be created that could hold most global variables.
  • passing them. Where global variables have been used as a convenience to avoid having to expand function parameter lists, we can just change to passing them as parameters.
  • converting them to #defines. May be appropriate for certain constants.
  • converting them to thread-local variables (may need to assess implication for embedded applications)
  • adding mutex constraints (so that they can only be accessed by one thread at a time)
  • migrating them to another code layer, e.g. into the GUI (this has already been done in some cases)
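
As a rough illustration of the "grouping" and "passing" options above, the following sketch contrasts a file-scope global with the same setting held in a context struct that callers pass explicitly. The names `g_verbosity`, `asc_settings` and `log_level` are hypothetical and are not taken from the ASCEND sources.

```c
/* Before: a file-scope global shared by every caller (hypothetical name). */
static int g_verbosity = 0;

static int log_level_global(void) {
    return g_verbosity;
}

/* After: related settings grouped into one struct (hypothetical type,
 * not an existing ASCEND structure) and passed as a parameter. */
struct asc_settings {
    int verbosity;
    double tolerance;
};

static int log_level(const struct asc_settings *s) {
    return s->verbosity;
}
```

With the second form, two independent `asc_settings` values can coexist in one process, which is what multithreaded or embedded use would require.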

Particular kinds of usage and what can be done about it

  • Lex/Yacc C-based scanners and parsers (including some still-current versions of GNU bison/flex) generate code full of globals and non-thread-safe functions. It's not our job to fix this. The 'safe' thing to do is to put a mutex around the parse function.
  • ASCEND-defined parser context flags, once properly identified, can remain global because the mutex for yacc will also protect them. Cosmetically, the 'properly identified' problem could be resolved by collecting these variables into a single well-named struct g_parser_context.
  • Type library globals: most of these should be moved into a formal struct ascUniverse. More about that below.
  • Instantiator tuning: most of these should be moved into a struct ascCompilerTuning.
  • Memory recycle pools of small objects. These are tied to globals (typically file-scope static variables hidden behind allocator functions) to avoid passing pool pointers everywhere. In solvers like linsolqr, these pools should be tied to major objects rather than to file-scope globals. The "too many pool objects, too short-lived" problem can be solved (taking mtx as an example, perhaps) by maintaining a list of idle mtx element pools; deciding when to clear the idle pools is a minor problem. In the compiler, many pools should come and go with their Universe, or with the destruction of the ASCEND type list and all dependent instances.
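
The first two points above could be sketched roughly as follows, assuming POSIX threads are available. The struct fields, `fake_yyparse` and `asc_parse_locked` are hypothetical stand-ins; the real generated `yyparse()` and the actual ASCEND parser flags are not shown here.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical grouping of the parser flags; field names are illustrative. */
struct parser_context {
    int in_model_body;
    int error_count;
};

static struct parser_context g_parser_context;
static pthread_mutex_t g_parser_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for the generated, non-reentrant parse function.
 * It reports one error for NULL or empty input and none otherwise. */
static int fake_yyparse(const char *text) {
    g_parser_context.error_count = (text == NULL || text[0] == '\0') ? 1 : 0;
    return g_parser_context.error_count;
}

/* Only one thread at a time may run the parser or touch its flags,
 * so the same mutex protects both. */
int asc_parse_locked(const char *text) {
    int status;
    pthread_mutex_lock(&g_parser_mutex);
    g_parser_context.in_model_body = 0;
    status = fake_yyparse(text);
    pthread_mutex_unlock(&g_parser_mutex);
    return status;
}
```

The design choice is that the globals stay where the generated code expects them, and thread safety comes from serialising every entry into the parser.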

Object lifetimes in ASCEND, and a design for reducing the use of globals

The most basic facts: ASCEND parses a set of type definitions into a self-consistent class hierarchy. These type definitions can then be used with the instantiator to create model instances with relation and variable data. These instance objects answer many queries by following pointers to the type definitions used to create them. ASCEND has a concept named UNIVERSAL (not unlike globals in C) whereby, for a given type definition, only one instance will ever be constructed. Both types and instances are deeply tied to a symbol (constant string) table. Throughout ASCEND, both types and symbols are routinely tested for identity by comparing pointers.

Key outcome of the above: if we wrap all the compiler globals in a context object of universal scope, we can then have multiple universe objects at a higher (scripting) level. Because identity is tested by pointer and each universe would carry its own symbol table, **objects from distinct universes cannot be compared except in string form**!
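
To make the pointer-identity point concrete, here is a minimal sketch of a per-universe symbol table. The fixed-size `symtab`, `intern` and `symtab_clear` are hypothetical and much simpler than ASCEND's real symbol table; they only illustrate why pointer comparison works inside one universe and fails across two.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_SYMS 16

/* Hypothetical, deliberately tiny symbol table: one per universe. */
struct symtab {
    char *names[MAX_SYMS];
    int count;
};

/* Return the unique stored pointer for `name` in `tab`, adding it if
 * needed. Returns NULL if the table is full or allocation fails. */
static const char *intern(struct symtab *tab, const char *name) {
    int i;
    char *copy;
    for (i = 0; i < tab->count; i++) {
        if (strcmp(tab->names[i], name) == 0) {
            return tab->names[i];
        }
    }
    if (tab->count == MAX_SYMS) {
        return NULL;
    }
    copy = malloc(strlen(name) + 1);
    if (copy == NULL) {
        return NULL;
    }
    strcpy(copy, name);
    tab->names[tab->count++] = copy;
    return copy;
}

/* Release every stored string and reset the table. */
static void symtab_clear(struct symtab *tab) {
    int i;
    for (i = 0; i < tab->count; i++) {
        free(tab->names[i]);
    }
    tab->count = 0;
}
```

Interning the same name twice in one table yields the same pointer, while interning it in two tables yields two different pointers whose contents still compare equal with `strcmp`.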

Given the above, then, ASCEND model type and instance data can be organized into something like the following. I will use C++ notation, but the implementation will be C. While in the parser, a single global (the pointer to the universe that the parse operates within) is needed, along with a mutex.

class ascUniverse {
public:
    // functions only
private:
    int auid; // serial number of this universe
    struct ascUniverse *next;
    // struct Symtab symtab;
    // memory pools for statement, vlist, etc.
    // struct Simlist simlist;
    // struct Library library;
    // etc.
    static struct ascUniverse *g_universe_list;
};

Many parts of the compiler will need only a pointer to the piece of the universe where their data lives; we should not be passing universe pointers everywhere in the compiler. Implementing all this correctly as a refactoring is extremely tedious and should only be done with a test suite and automated refactoring tools.
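
One possible rendering of the design above in plain C is sketched below. The contents of `Symtab` and `Library` are placeholders, and `asc_universe_create`, `asc_universe_destroy`, `library_add_type` and `g_next_auid` are hypothetical names used only for illustration. Note that the library routine takes just its own piece of the universe, not the universe pointer.

```c
#include <stdlib.h>

struct Symtab  { int nsymbols; };  /* placeholder contents */
struct Library { int ntypes; };    /* placeholder contents */

struct ascUniverse {
    int auid;                      /* serial number of this universe */
    struct ascUniverse *next;
    struct Symtab symtab;
    struct Library library;
};

/* The remaining globals; in a threaded build these would need a mutex. */
static struct ascUniverse *g_universe_list = NULL;
static int g_next_auid = 1;

/* Allocate a zeroed universe and push it onto the global list. */
struct ascUniverse *asc_universe_create(void) {
    struct ascUniverse *u = calloc(1, sizeof *u);
    if (u == NULL) {
        return NULL;
    }
    u->auid = g_next_auid++;
    u->next = g_universe_list;
    g_universe_list = u;
    return u;
}

/* Unlink `u` from the global list and free it; unknown pointers are ignored. */
void asc_universe_destroy(struct ascUniverse *u) {
    struct ascUniverse **p = &g_universe_list;
    while (*p != NULL && *p != u) {
        p = &(*p)->next;
    }
    if (u != NULL && *p == u) {
        *p = u->next;
        free(u);
    }
}

/* Library code receives only its own piece of the universe. */
int library_add_type(struct Library *lib) {
    return ++lib->ntypes;
}
```

A caller would write something like `library_add_type(&u->library)`, which keeps most compiler functions unaware of the enclosing universe.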

Global variables in libascend.so

The main place where global variables are a problem for ASCEND is in libascend, our core library, which includes the ASCEND parser/compiler and evaluation routines but hopefully excludes the solvers.

Below is a list of globals generated using GNU nm.

Static variables

Another place where quasi-global variables can occur is as static variables within functions. It still needs to be assessed whether the above listing includes variables of that kind.
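
For illustration, the sketch below shows a function-local static acting as a quasi-global, next to a variant that keeps the same state in a caller-owned struct. Both functions and the `id_source` type are hypothetical examples, not code from libascend.

```c
/* Quasi-global: the counter is hidden inside the function,
 * yet every caller in the process shares it. */
int next_id_static(void) {
    static int counter = 0;
    return ++counter;
}

/* Alternative: the state lives in an object owned by the caller,
 * so independent sequences can coexist. */
struct id_source {
    int counter;
};

int next_id(struct id_source *src) {
    return ++src->counter;
}
```

The first form has the same sharing problems as a file-scope global, even though its name is not visible outside the function.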