Changes between Initial Version and Version 1 of Sysel/Ideas


Timestamp:
2010-05-20T19:54:26Z
Author:
Jiri Svoboda
Comment:

Move ideas to a separate article

= Ideas for Sysel =

This article currently serves several purposes. First, as a memo, so that ideas and elaborations are not forgotten. Second, as a temporary source of information for anyone who wants to learn about plans for Sysel. Third, as a way of sharing the plans to allow discussion and brainstorming. Your comments and ideas are appreciated!

Unless stated otherwise, the ideas presented here are planned for inclusion, although they are likely to evolve as the implementation progresses.

== Code organization ==

Sysel shall employ ''packages'' and ''modules''. Together, these two constructs fully describe the organization of the codebase and allow a certain degree of freedom in how finely the code is partitioned, both in terms of namespace and code volume.

=== Packages ===

Packages provide two main features: a namespace and visibility controls. Packages thus provide a greater level of isolation than mere classes and allow safe composition of code developed by different (uncoordinated) teams. Packages can have a well-defined API/ABI and can be delivered in compiled form as libraries. Each package has a name, which must be fully qualified.

Within a package, symbol references only need to be qualified relative to the package. To reference a symbol outside the current package, the symbol must either be imported or the reference must be fully qualified. (TODO: Should we enforce explicit import of all symbols?) Symbols can only be imported individually or in a qualified manner. This ensures that there can be no collisions between symbols from different namespaces (which need not be under the control of the same entity). An import must name the imported symbols by their fully qualified names.

=== Modules ===

Modules provide a complementary, finer-grained means of decomposition. Usually each source file corresponds to exactly one module. Each module declares its (unqualified) name and the fully qualified name of the package it belongs to (which 'anchors' it in the code base). Conversely, each package specifies all the modules it consists of. Consequently, for each module we can determine which package it belongs to, and for each package we can determine all the modules (and thus all the symbols) it consists of.

Modules thus allow the source code to be broken into separate files while tying it together in a formal manner. When building a package or program, there is no need to list all its source files informally in a makefile. It is sufficient to point the compiler to the directories where it should look for source files and tell it which package to build.
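
The lookup a build tool would perform can be sketched outside Sysel. The following Python fragment is only an illustration: the `module <name> in <package>;` header and the `.sy` file names are invented for this sketch and are not Sysel syntax, and it indexes packages from module headers alone rather than from a package's own module list.

```python
import re
from collections import defaultdict

# Hypothetical module header, invented for this sketch (not Sysel syntax):
#     module <name> in <fully.qualified.package>;
HEADER = re.compile(r"^\s*module\s+(\w+)\s+in\s+([\w.]+)\s*;", re.MULTILINE)


def index_modules(sources):
    """Map each package name to a sorted list of (module, path) pairs.

    `sources` maps a file path to the text of that source file.
    """
    packages = defaultdict(list)
    for path, text in sources.items():
        match = HEADER.search(text)
        if match is None:
            continue  # no module header found; ignore the file
        module, package = match.groups()
        packages[package].append((module, path))
    return {pkg: sorted(mods) for pkg, mods in packages.items()}


def files_to_build(sources, package):
    """Return the source files that make up `package`."""
    return [path for _module, path in index_modules(sources).get(package, [])]


# Stand-in for the contents of the directories given to the compiler.
sources = {
    "src/a.sy": "module Lexer in org.example.compiler;\n",
    "src/b.sy": "module Parser in org.example.compiler;\n",
    "src/c.sy": "module Main in org.example.tool;\n",
}
print(files_to_build(sources, "org.example.compiler"))
```

Asking for `org.example.compiler` selects the two files that declare it and leaves `src/c.sy` out, without any file list being written by hand.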

Modules do not represent a namespace. Any symbol defined or imported in one module is accessible (unqualified) in every other module of the same package. Names of global symbols in all modules of a package must therefore be coordinated. Note that due to the object-oriented nature of the language, a package usually defines few global symbols, and a package is assumed to be under the control of a single entity.

Definitions of classes can be split across multiple modules (but not across packages). Thus large classes can be split across multiple source files.

== Dynamic linking ==

It should be possible to use, with similar simplicity and the same level of static type checking, not only ''compulsory libraries'', but also ''optional libraries'' and ''plugin libraries''.

Compulsory libraries are those required every time the executable is invoked (equivalent to `gcc -lname`). Optional libraries are only loaded once the application touches some symbol from the library. This is a very useful feature: binaries can be built with all optional dependencies enabled, yet the user need not install all of these libraries if they do not want to. This helps avoid ''dependency avalanches''.
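
The load-on-first-touch behaviour of an optional library can be imitated in Python with a small proxy. This is a rough analogy only: the `OptionalLibrary` class is hypothetical, the standard `json` module merely stands in for an optional dependency, and nothing here provides the static type checking that Sysel aims for.

```python
import importlib


class OptionalLibrary:
    """Proxy that loads a module only when one of its symbols is first touched."""

    def __init__(self, name):
        self._name = name
        self._module = None  # nothing is loaded yet

    @property
    def loaded(self):
        return self._module is not None

    def __getattr__(self, symbol):
        # Invoked only for names not found on the proxy itself,
        # i.e. for symbols that belong to the library.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, symbol)


json_lib = OptionalLibrary("json")   # standard module used as a stand-in
print(json_lib.loaded)               # no symbol touched yet
print(json_lib.dumps([1, 2]))        # first use triggers the load
print(json_lib.loaded)
```

If the library is not installed, the failure (an `ImportError`) surfaces only at the first use of one of its symbols, so a program that never touches the library runs without it.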

Plugin libraries are those where multiple libraries can exist, written against some common plugin interface. One possibility is to have ''packages'' implement ''package interfaces''. A package could be loaded at run time and a reference to it stored in a variable whose type is the ''package interface'' type. It would then be possible to refer to symbols within the dynamic package using standard qualified names (e.g. `P.symbol`). This enables full static type checking / interface checking for both the implementor and the user of the plugin.

== Remote objects ==

=== Basics ===

HelenOS IPC is usually employed in an RPC-like style. Remote objects would support asynchronous messaging in the language itself. Remote object classes (and interfaces) form an inheritance hierarchy separate from that of the ''local'' classes and interfaces. Remote interfaces are equivalent to the IPC interfaces now usually defined in HelenOS in `uspace/lib/c/include/ipc`. They would naturally support (multiple) inheritance. Servers contain remote classes which implement these interfaces.

When a client wants to use some service, it is given a reference to a remote object. This reference identifies not only the server we talk to, but possibly also the individual resource within the server that we are accessing. As a contrived example, a console server might provide the following two interfaces:

{{{
interface IConsole, remote is
        fun GetVC(vc_index : int) : IVC;
end

interface IVC, remote is
        fun GotoXY(x, y : int);
        fun Write(s : string);
end
}}}

When we invoke the `GetVC()` method, the console server passes us a reference to the remote object implementing the requested VC. We can then work with this particular VC using that reference:

{{{
var Con : IConsole;
var VC : IVC;

Con = NameService.GetConnection("console") as IConsole;
VC = Con.GetVC(2);
VC.GotoXY(10, 10);
VC.Write("Hello World!");
}}}

Connection creation and termination, as well as transaction management (identifying the objects being worked with), are handled automatically by the language run-time. The creation of threads and fibrils within a server is handled automatically as well. A server can potentially handle any number of parallel requests (though it might be possible to limit this with some quota, if required). Concurrent access to remote objects is possible (and often desired).

=== Remote invocation ===

When a method of a remote object is invoked, the method ID and its parameters are serialized and the resulting message is sent to the server. On the server, the method ID and arguments are de-serialized and the implementation of the method is invoked. When the method returns, the return value (and possibly output arguments) are serialized and sent back to the client. At the client, the return value(s) are de-serialized and returned to the caller.

Some notes:
 * Multiple threads/fibrils may use the same remote object in parallel without fear of blocking each other (as long as the server is properly implemented).
 * Stateful services can be implemented by the server handing out state objects (such as an open-file object on a file server).
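
The round trip described above can be sketched in Python. Everything specific here is an assumption made for illustration: JSON as the wire format, the numeric method IDs 1 and 2, the class names, and a direct in-process call standing in for the HelenOS IPC transport.

```python
import json


class VCServer:
    """Server-side object; method IDs index into its dispatch table."""

    def __init__(self):
        self.log = []
        self._methods = {1: self.goto_xy, 2: self.write}

    def goto_xy(self, x, y):
        self.log.append(("goto", x, y))

    def write(self, s):
        self.log.append(("write", s))
        return len(s)

    def handle(self, request: bytes) -> bytes:
        # De-serialize method ID and arguments, invoke, serialize the result.
        msg = json.loads(request)
        result = self._methods[msg["method"]](*msg["args"])
        return json.dumps({"result": result}).encode()


class VCProxy:
    """Client-side stub: turns a local call into a message and back."""

    def __init__(self, send):
        self._send = send  # callable carrying bytes to the server and back

    def _call(self, method_id, *args):
        request = json.dumps({"method": method_id, "args": list(args)}).encode()
        reply = self._send(request)
        return json.loads(reply)["result"]

    def goto_xy(self, x, y):
        return self._call(1, x, y)

    def write(self, s):
        return self._call(2, s)


server = VCServer()
vc = VCProxy(server.handle)  # in-process call stands in for the transport
vc.goto_xy(10, 10)
print(vc.write("Hello World!"))  # number of characters written
```

The caller only ever sees `vc.goto_xy(...)` and `vc.write(...)`; in Sysel the stub and the dispatch table would be generated by the language run-time from the remote interface instead of being written by hand.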

=== Promises ===

[http://en.wikipedia.org/wiki/Futures_and_promises Promises] can be used to express asynchronous behavior and potentially allow for [http://en.wikipedia.org/wiki/Promise_pipelining#Promise_pipelining promise pipelining] (a form of optimization). In our case it would suffice to have a specialized form of promise, one that promises some data to be delivered from a remote object. Promises would be declared using a prefix type operator `future`.

As long as the data received from a remote object is held in a `future` type, it is handled in an asynchronous fashion. Once the data is converted to a non-future type, execution blocks until the data is received.

Example:

{{{
interface IAsyncIO is
        fun AReadBlock(addr : int) : future Block;
end
}}}

{{{
fun ReadBlocksParallel(start_addr, count : int) : Block[] is
        var fblock : (future Block)[];

        for i in range(0, count) do
                -- This does not block
                fblock[i] = AReadBlock(start_addr + i);
        end

        -- All reads are now being executed in parallel.

        -- Each array element is implicitly converted from future Block to Block.
        -- This blocks until all data has been received.
        return fblock;
end
}}}
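
For comparison, the same pattern can be written with Python's standard `concurrent.futures` module, where the conversion from future to value is an explicit `result()` call rather than an implicit type conversion. The `read_block` function and `BLOCK_SIZE` are placeholders invented for this sketch; a thread pool stands in for asynchronous requests to a remote object.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4  # hypothetical block size, chosen only for the demo


def read_block(addr):
    """Stand-in for a slow remote read; returns BLOCK_SIZE bytes."""
    return bytes([addr % 256]) * BLOCK_SIZE


def read_blocks_parallel(start_addr, count):
    with ThreadPoolExecutor(max_workers=8) as pool:
        # submit() returns a Future immediately; this does not block.
        fblock = [pool.submit(read_block, start_addr + i) for i in range(count)]
        # result() plays the role of the future Block -> Block conversion
        # and blocks until the corresponding data has arrived.
        return [f.result() for f in fblock]


blocks = read_blocks_parallel(10, 3)
print(len(blocks), len(blocks[0]))
```

All reads are submitted before the first `result()` call, so they can proceed in parallel, and the results come back in request order regardless of completion order.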

== String language specification ==

It has been suggested by Pavel Rimsky that string literals in a program very often contain data in some machine-readable language (e.g. format strings, SQL statements) or references to external resources. It might be useful to be able to specify this in the program, so that external tools could recognize and work with such strings for purposes such as syntax checking, refactoring, etc.

Note: That means ''identifying'' the language the string contains. Defining any ''properties'' (e.g. syntax) of the language the string contains is out of scope!

One typical example is a formatting function. The format string argument is in a well-defined language. Here it would be useful to specify the language of the formal argument. For any use of such a function, the compilation tools could then try to verify the actual argument. Similarly, we might want to specify the language of a member variable.
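
Annotating a formal argument with the language it carries can be approximated in Python using `typing.Annotated` (Python 3.9 or later). The `Language` marker class, the label `"python-format"`, and the helper `string_languages` are all invented for this sketch; as in the note above, the marker only identifies the language and defines nothing about its syntax.

```python
from typing import Annotated, get_type_hints


class Language:
    """Marker naming the language a string holds (it defines no syntax)."""

    def __init__(self, name):
        self.name = name


def log_message(fmt: Annotated[str, Language("python-format")], *args) -> str:
    return fmt.format(*args)


def string_languages(func):
    """Return {parameter name: language name} for annotated parameters."""
    hints = get_type_hints(func, include_extras=True)
    found = {}
    for param, hint in hints.items():
        # Annotated types expose their extra arguments as __metadata__.
        for meta in getattr(hint, "__metadata__", ()):
            if isinstance(meta, Language):
                found[param] = meta.name
    return found


print(string_languages(log_message))
```

An external tool could use such a mapping to decide which checker to run on the actual argument at each call site, while the function itself behaves exactly as before.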

A different approach is specifying the language of a string literal in situ. This is reminiscent of (X)HTML, which allows embedding pieces of code written in different languages (e.g. CSS, ECMAScript) while specifying the embedded language using its MIME type, or of language constructs such as `extern "C"`.

Both approaches could be combined.

TODO: Consider where language annotations would be useful and how they should be realized lexically and syntactically. (Must look pretty!)

== Miscellaneous ideas ==

These ideas are considered for inclusion (but need not be included).

=== Member pointers ===

Delegates identify the object instance and the method to be called (but not the arguments). Conversely, member pointers identify the method to be called, but neither the arguments nor the object instance (a member pointer can be invoked on any object which is an instance of a given class). This feature comes from C++.
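
Python happens to offer close analogues of both notions, which makes the distinction easy to see: a bound method behaves like a delegate, and a function looked up on the class behaves like a member pointer. The `Counter` class is a made-up example.

```python
class Counter:
    def __init__(self, start):
        self.value = start

    def bump(self, step):
        self.value += step
        return self.value


a = Counter(0)
b = Counter(100)

# Delegate analogue: instance and method are fixed, arguments are not.
delegate = a.bump
print(delegate(1))   # 1

# Member pointer analogue: only the method is fixed;
# the instance is supplied at each call.
member = Counter.bump
print(member(a, 1))  # 2
print(member(b, 1))  # 101
```

Unlike a C++ member pointer, the Python version is not statically restricted to instances of `Counter`; that restriction is what a typed language would add.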

=== True inner classes ===

A true inner class is non-static in the sense that every instance of the class implicitly contains a reference to some instance of the outer class. The inner class is thus constructed in a non-static context (in the context of an object instance), and the outer object can be referenced via a keyword.
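
Python has no true inner classes, so the sketch below spells out by hand what such a feature would do implicitly: the outer reference is passed and stored explicitly. The `Document` and `Cursor` names are hypothetical.

```python
class Document:
    def __init__(self, title):
        self.title = title

    class Cursor:
        """Nested class whose instances always belong to one Document."""

        def __init__(self, outer, position):
            self.outer = outer  # explicit here; implicit in a true inner class
            self.position = position

        def describe(self):
            # `self.outer` plays the role of the outer-object keyword.
            return f"{self.outer.title}@{self.position}"

    def cursor(self, position=0):
        # Construction happens in the context of an object instance.
        return Document.Cursor(self, position)


doc = Document("notes")
print(doc.cursor(3).describe())  # notes@3
```

With language support, the `outer` parameter and field would disappear from the source and the compiler would guarantee that every `Cursor` has a valid enclosing `Document`.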

=== Output function arguments ===

Output arguments are semantically equivalent to additional return values of a function. They are a convenient way to return multiple values (especially since Sysel does not have tuples).

=== Built-in associative arrays ===

Maps and sets are so commonly used and so immensely useful that it might be worth incorporating them into the language core. This could bring greater ease of use and optimization opportunities.