Writing drivers for KolibriOS

From KolibriOS wiki
Revision as of 14:27, 19 August 2013 by Hidnplayr (talk | contribs)

Driver development for KolibriOS

Introduction

Warning 1. Before writing a driver for KolibriOS, consider whether you can achieve your goal with the application programming interface (API), in particular the hardware interface functions 41-46 and 62. There are several reasons to prefer an application to a driver. First of all, a bug in an application hangs only that application, and the operating system keeps working, whereas a bug in a driver can crash the entire system. Drivers are critical for system stability, yet they are difficult to debug. When developing an application you can hunt for bugs with mtdbg, which is quite an advanced debugger, but you can't use it to debug a driver. Sometimes the debugger built into Bochs will do, but it is of course useless when you work with real hardware. Otherwise you can only send messages to the KolibriOS debug board, which has many disadvantages.

Warning 2. Every driver depends heavily on the kernel architecture, and the KolibriOS kernel is modified several times a week. Most changes don't affect the driver subsystem, but sometimes important system functions exported to drivers are changed or removed, or new functions are added. So the code included in this article may not work "out of the box" when you compile it. Read the text carefully: I'll try to point out all likely sources of future problems and the modifications needed to fix them. My code is written for revision svn.450, which is the newest one at the time of writing (so it won't work in the 0.6.5.0 distribution without modification).

As a rule, the only reason to write a driver is to interface with hardware. A real driver includes a large amount of hardware-dependent code which has almost nothing to do with the subject of this article: the basic principles of the KolibriOS kernel driver subsystem. The process of driver development is shown with the following example: we'll write a driver which intercepts and logs all requests that applications send to the filesystem, and a control application which receives the data from the driver and displays it. FASM is used as the main development tool.

The Driver

KolibriOS kernel sources include a basic driver skeleton - sceletone.asm (it is placed in kernel/drivers in KolibriOS 0.6.5.0 kernel sources distribution or in svn://kolibrios.org/kernel/trunk/drivers on KolibriOS SVN repository). Let's have a look at it (it is svn.450 version):

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;                                                              ;;
;; Copyright (C) KolibriOS team 2004-2007. All rights reserved. ;;
;; Distributed under terms of the GNU General Public License    ;;
;;                                                              ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;driver sceletone

format MS COFF

include 'proc32.inc'
include 'imports.inc'

First of all (after the usual comments) we tell the assembler which binary format to use. Drivers must be compiled to the COFF object file format. “MS” in the middle means that the driver uses the extended version of COFF created by Microsoft, which allows sections to carry the “writeable” attribute.

Then the driver includes additional files. proc32.inc contains macros for standard procedure definitions and calls (proc/endp, stdcall/ccall/invoke/cinvoke, local for declaring local variables); it is placed in the same folder as sceletone.asm. This file is optional, and it's up to you whether to use it; in this article proc32.inc is used to make the code simpler (generally, macros can be a bad idea if you need maximal efficiency, but that is not the subject of this article).

imports.inc contains declarations of all exported kernel functions. Have a look at it: there's nothing special there, only a long list of trivial constructions. The list that the kernel actually uses when loading a driver, covering all exported functions and data, is in core/exports.inc (the kernel_export label), so if you export something new from the kernel, feel free to change that file. But don't forget to add your changes to imports.inc as well, so that others can use them.

Be careful! A compatibility issue can occur here. If your driver uses a function which isn't exported by the running kernel, the kernel will refuse to load the driver and will write “unresolved” and the function name to the debug board.

Well, let's go forward.

OS_BASE       equ 0x80000000
new_app_base  equ 0x00000000
PROC_BASE     equ OS_BASE+0x0080000

OS_BASE means the kernel loading address. For the “old” kernel (for example, version 0.6.5.0) it is 0, for “flat kernel” (current kernel) it is 0x80000000.

The new_app_base constant is the linear address at which applications are loaded. All applications are loaded at the same address, but they don't conflict, because every process has its own page table and every application believes that it is loaded at address 0. In ring 3 the CS/DS/ES/SS selectors have base new_app_base, and in ring 0 (kernel and drivers) they have base 0. So, to convert an application address into a kernel pointer you have to add new_app_base to it (take my word for it if this isn't clear). new_app_base can cause an incompatibility: in 0.6.5.0 this constant is 0x60400000, in svn.450 it is 0x80000000, and in the “flat” kernel (the current kernel) it is 0 (it is called “flat” because it uses a flat memory model). How do we get the real values of OS_BASE and new_app_base for a particular kernel? They're declared in const.inc in the kernel sources, so it's enough to look them up there. The KolibriOS memory map is in the memmap.inc file of the kernel sources distribution.
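
The address conversion described above amounts to a single addition. A minimal sketch, assuming the svn.450 value of new_app_base quoted in the text and a ring-3 pointer arriving in eax (the register choice is illustrative only):

```asm
use32
new_app_base equ 0x80000000       ; svn.450 value; look up const.inc for your kernel

; eax = pointer as the application sees it (ring 3, selector base new_app_base)
        lea     esi, [eax+new_app_base]   ; esi = same memory through ring-0 selectors (base 0)
        mov     al, [esi]                 ; the driver can now read the application's data
```

With the flat kernel, where new_app_base is 0, the lea simply copies the pointer, so the same source line works for both kernels once the constant is set correctly.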

The third constant (PROC_BASE) means nothing in our case, it isn't used at all.

struc IOCTL
{  .handle     dd ?
   .io_code    dd ?
   .input      dd ?
   .inp_size   dd ?
   .output     dd ?
   .out_size   dd ?
}

virtual at 0
   IOCTL IOCTL
end virtual

It's simply a structure definition (the trick with virtual is a common FASM idiom).

public START
public service_proc
public version

We've already imported the necessary functions from the kernel (imports.inc). Now we let the kernel know about the driver. Let's start from the end. The version variable, which is declared further down, is not actually the driver version. It is the version of the driver interface this driver works with. More precisely, two version codes are packed into one DWORD. The lower WORD is ignored by the current kernel, but you should put there the version number of the interface which is “native” for the driver. The upper WORD is the minimal interface version the driver requires. It must lie in the interval between the DRV_COMPAT and DRV_CURRENT constants, which are defined in core/dll.inc in the kernel sources. In 0.6.5.0 both of them are 3, but by svn.450 the interface had changed and both constants had become 4. What is this needed for? It solves the following problem. All changes in the driver subsystem fall into several groups: 1) full or partial modification of the basic principles; 2) removal of one or more exported kernel functions; 3) changes to a function's interface (parameters passed on the stack or in registers, additional arguments, argument order, etc.); 4) new functions. As a rule, in cases 1-3 you'll have to change your drivers by hand. But it would be a pity to recompile all drivers after a new kernel function is added, even though they don't need it. So the kernel supports loading “not very old” drivers.

version   dd 0x00030003

The driver skeleton is made for version 3, so it won't work with the current kernel. It hasn't been updated since 0.6.5.0 (besides the copyright), so new_app_base and version are old. By the way, keep in mind that in memory the lower WORD occupies the first 2 bytes of the DWORD and the upper WORD the last 2 bytes, because x86 stores the bytes of a WORD and the words of a DWORD in reverse (little-endian) order (I'm sure you know this, but just in case...)
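
To make the packing explicit, the version DWORD can be assembled from its two halves. This is only a sketch: API_NATIVE and API_REQUIRED are hypothetical names introduced here, and the value 4 is the svn.450 value of DRV_COMPAT and DRV_CURRENT quoted above.

```asm
; Hypothetical helper constants, not taken from the kernel sources
API_NATIVE    = 4       ; interface version the driver was written for (lower WORD)
API_REQUIRED  = 4       ; minimal interface version the driver needs (upper WORD)

version dd (API_REQUIRED shl 16) or API_NATIVE   ; same as dd 0x00040004
; byte order in memory: 04 00 04 00 (lower WORD first)
```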

The START procedure is called by the system when loading the driver and when closing it. In the first situation it initializes the driver, in the second - finishes its work. You don't need to export the service_proc procedure - there are other methods to let the kernel know about it. We'll talk about it later. And finally, the last portion of constants in sceletone.asm:

DEBUG      equ 1

DRV_ENTRY  equ 1
DRV_EXIT   equ -1
STRIDE     equ 4    ;size of row in devices table

The first of them turns on debug output in the if DEBUG/end if blocks. DRV_ENTRY and DRV_EXIT are the possible values of the START procedure argument. The last constant is purely cosmetic. Now we come to the first executable code. The line

section '.flat' code readable align 16

means exactly what it says. Then come more interesting things.

proc START stdcall, state:dword
   cmp [state], 1
   jne .exit
.entry:
   if DEBUG
      mov esi, msgInit
      call SysMsgBoardStr
   end if
   stdcall RegService, my_service, service_proc
   ret
.fail:
.exit:
   xor eax, eax
   ret
endp

This is the initialization/finalization procedure. When the driver is loaded it is called with the argument DRV_ENTRY = 1, and it must return a non-zero value if there are no problems. When the system shuts down, START is called with the argument DRV_EXIT = -1. In this particular case the driver doesn't work with any hardware, so there's no hardware initialization or finalization code. The procedure performs only the minimum needed to complete driver loading, namely registration. The RegService function is exported by the kernel. It accepts two arguments: the driver name (up to 16 characters including the terminating zero) and a pointer to the procedure that handles I/O requests. RegService returns 0 if registration fails, or a non-zero handle of the registered service if everything went well.

By the way, you may ask how to find out what a particular kernel function does. Suppose we need to allocate several pages of kernel memory. To find the right function we open core/exports.inc in the kernel sources, look through the exported names (they're quite human-readable) and notice szKernelAlloc. We page down to the kernel_export label and look for szKernelAlloc: it is bound to the procedure kernel_alloc. Now we look for the implementation of this procedure, which is in core/heap.inc. There are no comments, but the definition makes it clear that the function accepts one DWORD argument called size. Evidently kernel_alloc allocates as much kernel memory as its single argument says. The very first lines of its code show that the size is rounded up to a multiple of 4096 (i.e. the page size), so it allocates a whole number of pages, and size is given in bytes. After this digression let's return to our driver.
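
Once the signature is known, the call itself is short. The fragment below is a sketch, not code from the kernel or the skeleton: it assumes that imports.inc declares the import under the name KernelAlloc, that the function follows the stdcall convention used by proc32.inc, and that a zero result in eax signals failure; buffer_ptr and the .fail label are hypothetical names.

```asm
; driver fragment: needs proc32.inc and imports.inc, like the skeleton above
        stdcall KernelAlloc, 2*4096     ; size in bytes; two whole pages here
        test    eax, eax
        jz      .fail                   ; assumption: eax = 0 means no memory was allocated
        mov     [buffer_ptr], eax       ; buffer_ptr: hypothetical dd variable in the driver's data
```

Check the actual definition in core/heap.inc of your kernel revision before relying on any of these assumptions.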

After the startup procedure described above come the following lines. This is the service_proc procedure, the request handler.

handle    equ IOCTL.handle
io_code   equ IOCTL.io_code
input     equ IOCTL.input
inp_size  equ IOCTL.inp_size
output    equ IOCTL.output
out_size  equ IOCTL.out_size

align 4
proc service_proc stdcall, ioctl:dword

;  mov edi, [ioctl]
;  mov eax, [edi+io_code]

   xor eax, eax
   ret
endp

restore   handle
restore   io_code
restore   input
restore   inp_size
restore   output
restore   out_size

service_proc is called when somebody tries to communicate with the driver. It can be another driver which has obtained the handle somewhere and called ServiceHandler (a driver can also call itself through the I/O mechanism, but there is no need to), it can even be Its Majesty the Kernel (srv_handler, srv_handlerEx from core/dll.inc), or, finally, an application using function 68.17 (it can get the handle when loading the driver with 68.16). A zero return value means success, a non-zero value means an error.

First of all we define short names for the members of the structure which describes the request. The handle field holds the driver handle, io_code is the DWORD request identifier, and the other fields should be self-explanatory. The return value is passed to the code which called the driver (another driver, the kernel or an application). Finally, restore cancels the short names that were defined with equ for the structure fields. In this particular case it is unnecessary, but in more complex drivers such a common name as “input” may be used in many places.

The next portion of code in sceletone.asm looks for the specified hardware on the PCI bus. We don't need it, so we won't describe it here. You can work through it yourself if you like.

Now we can start writing our own driver. The beginning is trivial.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;                                                              ;;
;; Copyright (C) KolibriOS team 2004-2007. All rights reserved. ;;
;; Distributed under terms of the GNU General Public License    ;;
;;                                                              ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;FileMon: driver part

format MS COFF

include 'proc32.inc'
include 'imports.inc'

Write for svn.450 (for 0.6.5.0 you should write new_app_base equ 0x60400000):

OS_BASE       equ 0
new_app_base  equ 0x80000000

Several definitions:

struc IOCTL
{  .handle     dd ?
   .io_code    dd ?
   .input      dd ?
   .inp_size   dd ?
   .output     dd ?
   .out_size   dd ?
}

virtual at 0
   IOCTL IOCTL
end virtual

public START
public version

DRV_ENTRY  equ 1
DRV_EXIT   equ -1

section '.flat' code readable align 16

There's nothing special in this code. Now we're about to implement the driver's actual functionality. But wait a little! Before putting fingers to the keyboard, let's think about what we want our driver to do.

Our driver will accept 4 I/O request codes. Let's describe each of them.

Code 0 is always used to get the driver version (this time it really is the version of the driver, not of the driver interface). Handling this request is not mandatory, because the kernel is not interested in the driver version at all; but a driver can always be updated, and any code which works with our driver may need its version number. It can be returned in any format; in our example we'll return just a DWORD equal to 1.

Code 1 means “start logging”, code 2 means “return the log collected so far and reset it”, code 3 means “stop logging”. The log will be stored in an internal 16K buffer (we'll poll the driver once a second, so that should be enough; if an overflow occurs we'll flag it, and the extra log records will be dropped). The returned log information has the following format: a DWORD with the total size of the recorded log data; a byte equal to 0 or 1, where 1 means that some records didn't fit into the buffer; and, finally, the log data itself, an array of variable-sized structures. The first byte of every structure contains the number of the called kernel function, which determines the rest of the structure's contents.
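
The reply layout described above can be written down with the same struc idiom the skeleton uses for IOCTL. This is only a sketch: the name LOGHDR, its field names and LOGHDR_SIZE do not appear in the driver, which works with raw offsets instead.

```asm
struc LOGHDR
{  .size      dd ?      ; size of the log data that follows, in bytes
   .overflow  db ?      ; 1 if some records did not fit, otherwise 0
}

virtual at 0
   LOGHDR LOGHDR
   LOGHDR_SIZE = $      ; 5 bytes; the log records start at this offset
end virtual
```

The header is therefore 5 bytes long, and the first log record starts at offset LOGHDR_SIZE of the reply buffer.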

The START procedure in our example is trivial because we don't need to initialize anything in it:

proc START stdcall, state:dword
        cmp     [state], DRV_ENTRY
        jne     .exit
.entry:
; register the service; RegService returns 0 on failure,
; which is also what START must return in that case
        stdcall RegService, my_service, service_proc
        ret
.fail:
.exit:
        xor     eax, eax
        ret
endp

Query handlers:

handle     equ  IOCTL.handle
io_code    equ  IOCTL.io_code
input      equ  IOCTL.input
inp_size   equ  IOCTL.inp_size
output     equ  IOCTL.output
out_size   equ  IOCTL.out_size

proc service_proc stdcall, ioctl:dword
        mov     edi, [ioctl]
        mov     eax, [edi+io_code]
        test    eax, eax
        jz      .getversion
        dec     eax
        jz      .startlog
        dec     eax
        jz      .getlog
        dec     eax
        jz      .endlog
        xor     eax, eax
        ret
.getversion:
        cmp     [edi+out_size], 4
        jb      .err
        mov     edi, [edi+output]
        mov     dword [edi], 1          ; version of driver
.ok:
        xor     eax, eax
        ret
.err:
        or      eax, -1
        ret
.startlog:
        mov     al, 1
        xchg    al, [bLogStarted]
        test    al, al
        jnz     .ok
        mov     [logptr], logbuf
        call    hook
        jnc     .ok
        mov     [bLogStarted], 0
        jmp     .err
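.getlog:
        cli                             ; keep the hooks from touching the log while we copy it
        mov     esi, logbuf
        mov     ecx, [logptr]
        sub     ecx, esi                ; ecx = size of collected log data
        add     ecx, 5                  ; plus the 5-byte header (size dword + overflow byte)
        cmp     ecx, [edi+out_size]
        jbe     @f
        mov     ecx, [edi+out_size]     ; caller's buffer is too small: truncate
        mov     [bOverflow], 1          ; and report an overflow
@@:
        sub     ecx, 5                  ; ecx = number of data bytes to copy
        xor     eax, eax
        xchg    al, [bOverflow]         ; al = overflow flag, the flag itself is reset
        mov     edi, [edi+output]
        mov     [edi], ecx              ; header: size of log data
        add     edi, 4
        stosb                           ; header: overflow flag
        rep     movsb                   ; log data
        mov     [logptr], logbuf        ; reset the log
        sti
        xor     eax, eax
        ret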
.endlog:
        xchg    al, [bLogStarted]
        test    al, al
        jz      @f
        call    unhook
@@:
        xor     eax, eax
        ret
endp

restore   handle
restore   io_code
restore   input
restore   inp_size
restore   output
restore   out_size

Our driver doesn't require any input data, so we don't use the input/inp_size fields. The caller must put the size of the output buffer into the out_size field. If the driver is called by an application, the kernel performs the address translation, so ioctl contains valid pointers.

We've done everything necessary for communicating with the driver subsystem, and now we can write something more interesting. We intercept the filesystem kernel functions 6, 32, 33, 58 and 70. The system call handler, int 0x40, calls the particular kernel function through servetable (the code of the system call handler can be found in core/syscall.inc). We can make int 0x40 call our code instead of the system functions by replacing several entries of this table with the addresses of our handlers. The address of servetable can be obtained by scanning the code of the int 0x40 handler. The handler's address is easily read from the IDT, and the call instruction currently looks like call dword [servetable+edi*4], or in machine code FF 14 BD <servetable> (of course nobody can guarantee that it will always stay like this; for example, edi could be changed to eax).

hook:
        cli
        sub     esp, 6
        sidt    [esp]
        pop     ax      ; limit
        pop     eax     ; base
        mov     edx, [eax+40h*8+4]
        mov     dx, [eax+40h*8]
; edx contains address of i40
        mov     ecx, 100
.find:
        cmp     byte [edx], 0xFF
        jnz     .cont
        cmp     byte [edx+1], 0x14
        jnz     .cont
        cmp     byte [edx+2], 0xBD
        jz      .found
.cont:
        inc     edx
        loop    .find
        sti
        mov     esi, msg_failed
        call    SysMsgBoardStr
        stc
        ret
.found:
        mov     eax, [edx+3]
; eax contains address of servetable
        mov     [servetable_ptr], eax
        mov     edx, newfn06
        xchg    [eax+6*4], edx
        mov     [oldfn06], edx
        mov     edx, newfn32
        xchg    [eax+32*4], edx
        mov     [oldfn32], edx
        mov     edx, newfn33
        xchg    [eax+33*4], edx
        mov     [oldfn33], edx
        mov     edx, newfn58
        xchg    [eax+58*4], edx
        mov     [oldfn58], edx
        mov     edx, newfn70
        xchg    [eax+70*4], edx
        mov     [oldfn70], edx
        sti
        clc
        ret

unhook:
        cli
        mov     eax, [servetable_ptr]
        mov     edx, [oldfn06]
        mov     [eax+6*4], edx
        mov     edx, [oldfn32]
        mov     [eax+32*4], edx
        mov     edx, [oldfn33]
        mov     [eax+33*4], edx
        mov     edx, [oldfn58]
        mov     [eax+58*4], edx
        mov     edx, [oldfn70]
        mov     [eax+70*4], edx
        sti
        ret
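
The first two loads in hook rely on the layout of a 32-bit interrupt gate, in which the handler address is split into two halves. The sketch below spells that layout out; the struc IDTGATE and its field names are illustrative and are not part of the driver or the kernel sources.

```asm
use32
; One 8-byte IDT entry (32-bit interrupt gate) as defined by the x86 architecture
struc IDTGATE
{  .offset_lo  dw ?     ; bits 0..15 of the handler address
   .selector   dw ?     ; code segment selector
   .reserved   db ?
   .type_attr  db ?     ; gate type, DPL, present bit
   .offset_hi  dw ?     ; bits 16..31 of the handler address
}

virtual at 0
   IDTGATE IDTGATE
end virtual

; eax = IDT base, as read with sidt
        mov     edx, [eax+40h*8+IDTGATE.reserved]   ; upper half of edx = offset_hi
        mov     dx, [eax+40h*8+IDTGATE.offset_lo]   ; lower half of edx = offset_lo
; edx = address of the int 0x40 handler
```

Here IDTGATE.reserved equals 4 and IDTGATE.offset_lo equals 0, so the two instructions encode the same accesses as the ones in hook.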

Two helper functions, which append a byte or a dword to the log:

write_log_byte:
; in: al=byte
        push    ecx
        mov     ecx, [logptr]
        inc     ecx
        cmp     ecx, logbuf + logbufsize
        ja      @f
        mov     [logptr], ecx
        mov     [ecx-1], al
        pop     ecx
        ret
@@:
        mov     [bOverflow], 1
        pop     ecx
        ret

write_log_dword:
; in: eax=dword
        push    ecx
        mov     ecx, [logptr]
        add     ecx, 4
        cmp     ecx, logbuf + logbufsize
        ja      @f
        mov     [logptr], ecx
        mov     [ecx-4], eax
        pop     ecx
        ret
@@:
        mov     [bOverflow], 1
        pop     ecx
        ret

When writing the handlers, be aware that the registers are rotated compared with the application's call to int 0x40 (the handler receives in eax what the application passed in ebx, in ebx what was passed in ecx, and so on), and that all pointers belong to ring 3.

newfn06:
        cli
        push    [logptr]
        push    eax
        mov     al, 6           ; function 6
        call    write_log_byte
        mov     eax, ebx        ; start block
        call    write_log_dword
        mov     eax, ecx        ; number of blocks
        call    write_log_dword
        mov     eax, edx        ; output buffer
        call    write_log_dword
        pop     eax
        push    eax
        push    esi
        lea     esi, [eax+new_app_base] ; pointer to file name
@@:
        lodsb
        call    write_log_byte
        test    al, al
        jnz     @b
        pop     esi
        pop     eax
        cmp     [bOverflow], 0
        jz      .nooverflow
        pop     [logptr]
        jmp     @f
.nooverflow:
        add     esp, 4
@@:
        sti
        jmp     [oldfn06]

newfn32:
        cli
        push    [logptr]
        push    eax
        mov     al, 32          ; function 32
        call    write_log_byte
        pop     eax
        push    eax
        push    esi
        lea     esi, [eax+new_app_base] ; pointer to file name
@@:
        lodsb
        call    write_log_byte
        test    al, al
        jnz     @b
        pop     esi
        pop     eax
        cmp     [bOverflow], 0
        jz      .nooverflow
        pop     [logptr]
        jmp     @f
.nooverflow:
        add     esp, 4
@@:
        sti
        jmp     [oldfn32]

newfn33:
        cli
        push    [logptr]
        push    eax
        mov     al, 33          ; function 33
        call    write_log_byte
        mov     eax, ebx        ; input buffer
        call    write_log_dword
        mov     eax, ecx        ; number of bytes
        call    write_log_dword
        pop     eax
        push    eax
        push    esi
        lea     esi, [eax+new_app_base] ; pointer to file name
@@:
        lodsb
        call    write_log_byte
        test    al, al
        jnz     @b
        pop     esi
        pop     eax
        cmp     [bOverflow], 0
        jz      .nooverflow
        pop     [logptr]
        jmp     @f
.nooverflow:
        add     esp, 4
@@:
        sti
        jmp     [oldfn33]

newfn58:
        cli
        push    [logptr]
        push    eax
        push    ebx
        lea     ebx, [eax+new_app_base]
        mov     al, 58          ; function 58
        call    write_log_byte
; dump information structure
        mov     eax, [ebx]
        call    write_log_dword
        mov     eax, [ebx+4]
        call    write_log_dword
        mov     eax, [ebx+8]
        call    write_log_dword
        mov     eax, [ebx+12]
        call    write_log_dword
        push    esi
        lea     esi, [ebx+20]           ; pointer to file name
@@:
        lodsb
        call    write_log_byte
        test    al, al
        jnz     @b
        pop     esi
        pop     ebx
        pop     eax
        cmp     [bOverflow], 0
        jz      .nooverflow
        pop     [logptr]
        jmp     @f
.nooverflow:
        add     esp, 4
@@:
        sti
        jmp     [oldfn58]

newfn70:
        cli
        push    [logptr]
        push    eax
        push    ebx
        lea     ebx, [eax+new_app_base]
        mov     al, 70          ; function 70
        call    write_log_byte
; dump information structure
        mov     eax, [ebx]
        call    write_log_dword
        mov     eax, [ebx+4]
        call    write_log_dword
        mov     eax, [ebx+8]
        call    write_log_dword
        mov     eax, [ebx+12]
        call    write_log_dword
        mov     eax, [ebx+16]
        call    write_log_dword
        push    esi
        lea     esi, [ebx+20]           ; pointer to file name
        lodsb
        test    al, al
        jnz     @f
        lodsd
        lea     esi, [eax+new_app_base+1]
@@:
        dec     esi
@@:
        lodsb
        call    write_log_byte
        test    al, al
        jnz     @b
        pop     esi
        pop     ebx
        pop     eax
        cmp     [bOverflow], 0
        jz      .nooverflow
        pop     [logptr]
        jmp     @f
.nooverflow:
        add     esp, 4
@@:
        sti
        jmp     [oldfn70]

That's all with the code. Now we supply the data:

version         dd      0x00040004
my_service      db      'fmondrv',0

msg_failed      db      'Cannot hook required functions',13,10,0

section '.data' data readable writable align 16

servetable_ptr  dd      ?

oldfn06         dd      ?
oldfn32         dd      ?
oldfn33         dd      ?
oldfn58         dd      ?
oldfn70         dd      ?

logptr          dd      ?
logbufsize = 16*1024
logbuf          rb      logbufsize

bOverflow       db      ?
bLogStarted     db      ?

The entire code of the driver is placed in fmondrv.asm which is supplied with this article. Then we compile it with

fasm fmondrv.asm

Then we can pack it with kpack, which decreases its size from 1850 to 750 bytes. The kernel loads kpacked files without any problems. By the way, the object file contains a compilation timestamp at offset +4. The kernel isn't interested in it, so you can fill it with zeros and save a byte of storage. To install the driver, copy it to /rd/1/drivers; now it's ready for loading.

The control application

The control program will write text to the console and exit on Esc. It requires the console library version 3 or newer; 0.6.5.0 ships with version 2, so you'll have to download the latest version: http://diamondz.land.ru/console.7z. We use testcon.asm as a template (testcon2.asm is suitable, too) with the following changes: we set REQ_DLL_VER to 3, in the import table (the myimport label) we delete con_write_asciiz and add con_printf, con_kbhit and con_getch2, and, of course, we write our code after the “Now do some work” comment.

use32
        db      'MENUET01'
        dd      1
        dd      start
        dd      i_end
        dd      mem
        dd      mem
        dd      0
        dd      0


REQ_DLL_VER = 3
DLL_ENTRY = 1

start:
; First 3 steps are intended to load/init console DLL
; and are identical for all console programs

; load DLL
        mov     eax, 68
        mov     ebx, 19
        mov     ecx, dll_name
        int     0x40
        test    eax, eax
        jz      exit

; initialize import
        mov     edx, eax
        mov     esi, myimport
import_loop:
        lodsd
        test    eax, eax
        jz      import_done
        push    edx
import_find:
        mov     ebx, [edx]
        test    ebx, ebx
        jz      exit;import_not_found
        push    eax
@@:
        mov     cl, [eax]
        cmp     cl, [ebx]
        jnz     import_find_next
        test    cl, cl
        jz      import_found
        inc     eax
        inc     ebx
        jmp     @b
import_find_next:
        pop     eax
        add     edx, 8
        jmp     import_find
import_found:
        pop     eax
        mov     eax, [edx+4]
        mov     [esi-4], eax
        pop     edx
        jmp     import_loop
import_done:

; check version
        cmp     word [dll_ver], REQ_DLL_VER
        jb      exit
        cmp     word [dll_ver+2], REQ_DLL_VER
        ja      exit
        push    DLL_ENTRY
        call    [dll_start]

; yes! Now do some work (say helloworld in this case).
        push    caption
        push    -1
        push    -1
        push    -1
        push    -1
        call    [con_init]

Load the driver. On error, report it on the console and exit.

        mov     eax, 68
        mov     ebx, 16
        mov     ecx, drivername
        int     0x40
        mov     [hDriver], eax
        test    eax, eax
        jnz     @f
loaderr:
        push    aCantLoadDriver
        call    [con_printf]
        add     esp, 4
        push    0
        call    [con_exit]
        jmp     exit
@@:

Check the driver version number by sending the driver a request with code 0.

        and     [ioctl_code], 0
        and     [inp_size], 0
        mov     [outp_size], 4
        mov     [output], driver_ver
        mov     eax, 68
        mov     ebx, 17
        mov     ecx, ioctl
        int     0x40
        test    eax, eax
        jnz     loaderr
        cmp     [driver_ver], 1
        jnz     loaderr

Starting logging:

        mov     [ioctl_code], 1
        and     [inp_size], 0
        and     [outp_size], 0
        mov     eax, 68
        mov     ebx, 17
        mov     ecx, ioctl
        int     0x40
        test    eax, eax
        jnz     loaderr

The driver has been loaded and is logging now. Let's tell the user about it:

        push    str0
        call    [con_printf]
        add     esp, 4

Now we enter the main loop. Once a second the program checks whether there are any new events. In principle, the driver could wake up the application thread itself when a filesystem event occurs, but filesystem calls can be very frequent and it's hardly useful to wake the application for each of them. Processing the data is long but quite easy (the arguments of the different system functions differ, so we have to handle every case separately).

mainloop:
        mov     eax, 5
        mov     ebx, 100
        int     0x40
        mov     [ioctl_code], 2
        and     [inp_size], 0
        mov     [outp_size], 1+16*1024
        mov     [output], logbuf
        mov     eax, 68
        mov     ebx, 17
        mov     ecx, ioctl
        int     0x40
        push    eax
        mov     ecx, dword [logbuf]
        mov     esi, logbuf+5
message:
        test    ecx, ecx
        jz      done
        movzx   eax, byte [esi]
        push    eax
        push    str1
        call    [con_printf]
        add     esp, 8
        lodsb
        cmp     al, 6
        jz      fn06
        cmp     al, 32
        jz      fn32
        cmp     al, 33
        jz      fn33
        cmp     al, 58
        jz      fn58
        sub     ecx, 1+4*5      ; size of log data for fn70 (excluding filename)
        lodsd
        cmp     eax, 10
        jae     fn70unk
        jmp     [fn70+eax*4]
fn70unk:
        push    dword [esi+12]
        push    dword [esi+8]
        push    dword [esi+4]
        push    dword [esi]
        push    eax
        push    str2
        call    [con_printf]
        add     esp, 6*4
        add     esi, 16
        jmp     print_name
fn70readfile:
        push    dword [esi+12]
        push    dword [esi+8]
        push    dword [esi]
        push    str3
        call    [con_printf]
        add     esp, 4*4
        add     esi, 16
        jmp     print_name
fn70readfolder:
        mov     eax, str41
        test    byte [esi+4], 1
        jz      @f
        mov     eax, str42
@@:
        push    dword [esi+12]
        push    dword [esi+8]
        push    dword [esi]
        push    eax
        push    str4
        call    [con_printf]
        add     esp, 5*4
        add     esi, 16
        jmp     print_name
fn70create:
        push    dword [esi+12]
        push    dword [esi+8]
        push    str5
        call    [con_printf]
        add     esp, 3*4
        add     esi, 16
        jmp     print_name
fn70write:
        push    dword [esi+12]
        push    dword [esi+8]
        push    dword [esi+4]
        push    str6
        call    [con_printf]
        add     esp, 4*4
        add     esi, 16
        jmp     print_name
fn70setsize:
        push    dword [esi]
        push    str7
        call    [con_printf]
        add     esp, 4*2
        add     esi, 16
        jmp     print_name
fn70getattr:
        push    dword [esi+12]
        push    str8
        call    [con_printf]
        add     esp, 4*2
        add     esi, 16
        jmp     print_name
fn70setattr:
        push    dword [esi+12]
        push    str9
        call    [con_printf]
        add     esp, 4*2
        add     esi, 16
        jmp     print_name
fn70execute:
        push    str10
        call    [con_printf]
        add     esp, 4
        lodsd
        test    al, 1
        jz      @f
        push    str10_1
        call    [con_printf]
        add     esp, 4
@@:
        lodsd
        test    eax, eax
        jz      @f
        push    eax
        push    str10_2
        call    [con_printf]
        add     esp, 8
@@:
        add     esi, 8
        jmp     print_name
fn70delete:
        push    str11
        call    [con_printf]
        add     esp, 4
        add     esi, 16
        jmp     print_name
fn70createfolder:
        push    str12
        call    [con_printf]
        add     esp, 4
        add     esi, 16
        jmp     print_name
fn58:
        sub     ecx, 1+4*4      ; size of log data for fn58 (excluding filename)
        lodsd
        test    eax, eax
        jz      fn58read
        cmp     eax, 1
        jz      fn58write
        cmp     eax, 8
        jz      fn58lba
        cmp     eax, 15
        jz      fn58fsinfo
fn58unk:
        push    dword [esi+8]
        push    dword [esi+4]
        push    dword [esi]
        push    eax
        push    str13
        call    [con_printf]
        add     esp, 5*4
        add     esi, 12
        jmp     print_name
fn58read:
        push    dword [esi+8]
        mov     eax, [esi+4]
        shl     eax, 9
        push    eax
        mov     eax, [esi]
        shl     eax, 9
        push    eax
        push    str3
        call    [con_printf]
        add     esp, 4*4
        add     esi, 12
        jmp     print_name
fn58write:
        push    dword [esi+8]
        push    dword [esi+4]
        push    str5
        call    [con_printf]
        add     esp, 3*4
        add     esi, 12
        jmp     print_name
fn58lba:
        push    dword [esi+8]
        push    dword [esi]
        push    str14
        call    [con_printf]
        add     esp, 3*4
        add     esi, 12
        jmp     print_name
fn58fsinfo:
        push    str15
        call    [con_printf]
        add     esp, 4
        add     esi, 12
        jmp     print_name
fn33:
        sub     ecx, 1+2*4      ; size of log data for fn33
        lodsd
        push    eax
        lodsd
        push    eax
        push    str5
        call    [con_printf]
        add     esp, 3*4
        push    aRamdisk
        call    [con_printf]
        add     esp, 4
        jmp     print_name
fn32:
        dec     ecx             ; only filename is logged
        push    str11
        call    [con_printf]
        push    aRamdisk
        call    [con_printf]
        add     esp, 4+4
        jmp     print_name
fn06:
        sub     ecx, 1+3*4      ; size of log data for fn06
        push    dword [esi+8]
        mov     eax, [esi+4]
        test    eax, eax
        jnz     @f
        inc     eax
@@:
        shl     eax, 9
        push    eax
        lodsd
        test    eax, eax
        jnz     @f
        inc     eax
@@:
        dec     eax
        shl     eax, 9
        push    eax
        push    str3
        call    [con_printf]
        add     esp, 4*4
        push    aRamdisk
        call    [con_printf]
        add     esp, 4
        add     esi, 8
print_name:
        push    esi
        push    str_final
        call    [con_printf]
        add     esp, 8
@@:
        dec     ecx             ; name bytes (including the final zero) count toward the log size
        lodsb
        test    al, al
        jnz     @b
        jmp     message
done:
        cmp     byte [logbuf+4], 0
        jz      @f
        push    str_skipped
        call    [con_printf]
        add     esp, 4
@@:
; we have printed all the driver data; now check the console (did the user press Esc?)
        call    [con_kbhit]
        test    al, al
        jz      mainloop
        call    [con_getch2]
        cmp     al, 27
        jnz     mainloop

Once Esc has been pressed, tell the driver to stop logging:

        mov     [ioctl_code], 3
        and     [inp_size], 0
        and     [outp_size], 0
        mov     eax, 68
        mov     ebx, 17
        mov     ecx, ioctl
        int     0x40

Close console:

        push   1
        call   [con_exit]

And finish the program:

exit:
        or     eax, -1
        int    0x40

The program data:

dll_name db '/rd/1/console.obj',0
caption db 'FileMon',0
drivername db 'fmondrv',0
aCantLoadDriver db "Can't load driver",13,10,0

str0 db 'Monitoring file system calls... Press Esc to exit',10,0

str1 db 'Fn%2d: ',0
str2 db 'unknown subfunction %d, parameters: 0x%X, 0x%X, 0x%X, 0x%X, name ',0
str3 db 'read file, starting from 0x%X, %d bytes, to 0x%X; name ',0
str4 db 'read folder (%s version), starting from %d, %d blocks, to 0x%X; name ',0
str41 db 'ANSI',0
str42 db 'UNICODE',0
str5 db 'create/rewrite file, %d bytes from 0x%X; name ',0
str6 db 'write file, starting from 0x%X, %d bytes, from 0x%X; name ',0
str7 db 'set file size to %d bytes; name ',0
str8 db 'get file attributes to 0x%X; name ',0
str9 db 'set file attributes from 0x%X; name ',0
str10 db 'execute ',0
str10_1 db '(in debug mode) ',0
str10_2 db '(with parameters 0x%X) ',0
str11 db 'delete ',0
str12 db 'create folder ',0
str13 db 'unknown subfunction %d, parameters: 0x%X, 0x%X, 0x%X, name ',0
str14 db 'LBA read sector 0x%X to 0x%X from device ',0
str15 db '(obsolete!) query fs information of ',0
aRamdisk db '/rd/1/',0
str_final db '%s',10,0
str_skipped db '[Some information skipped]',10,0

align 4
label fn70 dword
        dd      fn70readfile
        dd      fn70readfolder
        dd      fn70create
        dd      fn70write
        dd      fn70setsize
        dd      fn70getattr
        dd      fn70setattr
        dd      fn70execute
        dd      fn70delete
        dd      fn70createfolder

align 4
myimport:
dll_start       dd      aStart
dll_ver         dd      aVersion
con_init        dd      aConInit
con_printf      dd      aConPrintf
con_exit        dd      aConExit
con_kbhit       dd      aConKbhit
con_getch2      dd      aConGetch2
                dd      0

aStart          db      'START',0
aVersion        db      'version',0
aConInit        db      'con_init',0
aConPrintf      db      'con_printf',0
aConExit        db      'con_exit',0
aConKbhit       db      'con_kbhit',0
aConGetch2      db      'con_getch2',0

i_end:

align 4
ioctl:
hDriver         dd      ?
ioctl_code      dd      ?
input           dd      ?
inp_size        dd      ?
output          dd      ?
outp_size       dd      ?

driver_ver      dd      ?
logbuf          rb      16*1024+5

align 4
rb 2048 ; stack
mem:
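
The parsing loop in the listing relies on a fixed layout of logbuf. The following summary is only a sketch inferred from the offsets and sizes used in that code; the constant names are hypothetical and do not appear in the program, which uses the literal values instead.

        ; Hypothetical names for the offsets that the listing uses as literals.
        LOG_SIZE        = 0     ; dword: number of bytes of record data
        LOG_SKIPPED     = 4     ; byte:  nonzero if some information was skipped
        LOG_RECORDS     = 5     ; the first record starts here

        ; Every record is: db function number, parameters, zero-terminated name.
        ; Size of the part before the name, as subtracted from ecx in the listing:
        REC_FN06        = 1+3*4 ; three dwords
        REC_FN32        = 1     ; no parameters, only the name
        REC_FN33        = 1+2*4 ; two dwords
        REC_FN58        = 1+4*4 ; subfunction number and three dwords
        REC_FN70        = 1+4*5 ; subfunction number and four dwords

With such names, mov ecx, dword [logbuf] could be written as mov ecx, dword [logbuf+LOG_SIZE], and mov esi, logbuf+5 as mov esi, logbuf+LOG_RECORDS.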

Interrupts

If you're writing a driver that works with actual hardware, you will probably need to handle interrupts at some point. For this purpose the kernel exports the function AttachIntHandler:

    stdcall AttachIntHandler, int_number, int_handler_proc, dword 0

The interrupt handler itself is expected to report whether the interrupt came from the expected device or not.
(This can usually be verified by checking your hardware's status register.)
If the driver handled the interrupt, return 1 in eax; otherwise return 0.
The registers ebx, esi, edi and ebp must be preserved. The handler must end with a plain ret, not iret.
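
A minimal sketch of a handler that follows these rules is given below. STATUS_PORT and IRQ_PENDING are hypothetical values chosen for illustration only; the real port and bit come from the documentation of your hardware, and some devices expose their status register through memory rather than an I/O port.

        ; Assumed values, not taken from any real device:
        STATUS_PORT = 0x300             ; I/O port of the status register
        IRQ_PENDING = 0x01              ; "interrupt pending" bit in that register

        align 4
        int_handler_proc:
                mov     dx, STATUS_PORT
                in      al, dx          ; read the device status register
                test    al, IRQ_PENDING
                jz      .not_ours       ; bit is clear: another device raised the interrupt
                ; ... service the device here; ebx, esi, edi and ebp must keep their values ...
                xor     eax, eax
                inc     eax             ; eax = 1: the interrupt was handled
                ret                     ; plain ret, not iret
        .not_ours:
                xor     eax, eax        ; eax = 0: the interrupt was not for this driver
                ret

The sketch touches only eax and edx, so the registers that must be preserved need no saving here; a handler that uses ebx, esi, edi or ebp should push them on entry and pop them before ret.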