Embedded Linux Development on KOBOT's Imaging Subsystem


This document describes the development of KOBOT's imaging subsystem and is intended to expand as the project progresses. After reading this document, you will learn:

  • Some basic embedded-systems terminology.
  • How cross-development is done on Linux systems using GNU tools.
  • How U-Boot boots an Intel PXA based board.
  • How the Linux kernel is ported to an Intel PXA based board.
  • KOBOT's imaging subsystem at the hardware level.
  • What the Linux system running on KOBOT consists of.
  • How KOBOT's camera driver is implemented.
  • What's next.


An Overview of Embedded Systems

An embedded system is a combination of hardware and software designed for a specific purpose. Cell phones and mp3 players are typical examples.

Figure 1: tuxPhone system architecture

The embedded system architecture above is from an open source cellular phone project called tuxPhone. The computing engine is a Gumstix Connex board running Linux. The RF module is a Telit GM862 that supports GPRS, voice, SMS, fax and a camera. An LCD is easy to connect, as the PXA255 microcontroller has an integrated LCD controller that can be driven easily from Linux. The tuxPhone is a nice example of how general-purpose hardware and an operating system can be used as the basis for a dedicated system.


Cross-development is a key concept when developing embedded software. It means developing software on a host system whose properties (architecture or operating system) differ from those of the target system the software is being developed for. Due to hardware limitations on the target system, it is more suitable, and sometimes mandatory, to develop the software in a more resource-rich environment. A correctly established cross-development environment is essential for the success of an embedded project.

Figure 2: An example cross-development environment

The figure above is a classic case of embedded systems development. Because the target has no display, its output is directed to the serial port. A telnet or ssh connection can also be established over Ethernet or USB. A developer can therefore write software on the host machine, compile it and upload it to the target over one of these links. Even better, if there is a bootloader that lets you mount NFS filesystems, a directory exported from your host can act as a filesystem on your target.

Embedded Linux

You've probably heard stories about Linux running on routers, robots and PDAs, besides its well-known desktop and server usage. A few important points should make you consider using Linux in your embedded project:

  • Whichever CPU or MCU you want to use in your project, Linux has probably been ported to it already, as it is among the operating systems with the largest number of supported architectures.
  • It has great network support.
  • You can get community support that probably surpasses the support any proprietary operating systems provider can give. Just don't forget to give back to the community :)

Building a cross-compiler using GNU software

Although a more systematic method for building a cross-compiler is covered in the Buildroot section, the following sites can be consulted to see how a GNU-based cross-compiler is built manually.



Das U-Boot

A bootloader in an Embedded Linux system plays the role that the BIOS plays on a standard desktop PC. As stated in Documentation/arm/Booting, after power-up the bootloader must perform the following:

  1. Setup and initialize RAM
    The boot loader is expected to find and initialize all RAM that the kernel will use for volatile data storage in the system. It performs this in a machine dependent manner. (It may use internal algorithms to automatically locate and size all RAM, or it may use knowledge of the RAM in the machine, or any other method the boot loader designer sees fit.)
  2. Initialize one serial port
    The boot loader should initialize and enable one serial port on the target. This allows the kernel serial driver to automatically detect which serial port it should use for the kernel console (generally used for debugging purposes, or communication with the target.)
    As an alternative, the boot loader can pass the relevant 'console=' option to the kernel via the tagged lists specifying the port, and serial format options as described in
  3. Detect the machine type
    The boot loader should detect the machine type it is running on by some method. Whether this is a hard coded value or some algorithm that looks at the connected hardware is beyond the scope of this document. The boot loader must ultimately be able to provide a MACH_TYPE_xxx value to the kernel. (see linux/arch/arm/tools/mach-types).
  4. Setup the kernel tagged list
    The boot loader must create and initialize the kernel tagged list. A valid tagged list starts with ATAG_CORE and ends with ATAG_NONE. The ATAG_CORE tag may or may not be empty. An empty ATAG_CORE tag has the size field set to '2' (0x00000002). The ATAG_NONE must set the size field to zero.
    Any number of tags can be placed in the list. It is undefined whether a repeated tag appends to the information carried by the previous tag, or whether it replaces the information in its entirety; some tags behave as the former, others the latter.
    The boot loader must pass at a minimum the size and location of the system memory, and the root filesystem location. Therefore, the minimum tagged list consists of ATAG_CORE, followed by ATAG_MEM, followed by ATAG_NONE.
    The tagged list should be stored in system RAM.
    The tagged list must be placed in a region of memory where neither the kernel decompressor nor initrd 'bootp' program will overwrite it. The recommended placement is in the first 16KiB of RAM.
  5. Calling the kernel image
    There are two options for calling the kernel zImage. If the zImage is stored in flash, and is linked correctly to be run from flash, then it is legal for the boot loader to call the zImage in flash directly. Alternatively, the zImage may be placed in system RAM and called there.
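
The minimal tagged list from step 4 can be sketched in C as a flat array of 32-bit words. This is only an illustration: the function name is hypothetical, the tag constants are the values believed to match the kernel's ARM setup header and should be checked against your kernel tree, and the root filesystem location (normally passed through a further tag) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tag identifiers; assumed to match the kernel's ARM setup header. */
#define ATAG_NONE 0x00000000u
#define ATAG_CORE 0x54410001u
#define ATAG_MEM  0x54410002u

/*
 * Writes ATAG_CORE (empty), ATAG_MEM and ATAG_NONE into buf.
 * Each tag starts with a header of two words: size (in words) and tag id.
 * Returns the number of 32-bit words written (8); buf must hold that many.
 */
size_t build_min_taglist(uint32_t *buf, uint32_t mem_start, uint32_t mem_size)
{
    size_t i = 0;

    assert(buf != NULL);

    buf[i++] = 2;            /* empty ATAG_CORE: header only */
    buf[i++] = ATAG_CORE;

    buf[i++] = 4;            /* ATAG_MEM: header + size + start */
    buf[i++] = ATAG_MEM;
    buf[i++] = mem_size;
    buf[i++] = mem_start;

    buf[i++] = 0;            /* ATAG_NONE: size field must be zero */
    buf[i++] = ATAG_NONE;

    return i;
}
```

For a board with 64 MB of RAM, a call such as build_min_taglist(buf, 0xA0000000u, 64u << 20) would describe the memory; the start address used here is an assumption about where SDRAM is mapped.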

This is where Das U-Boot, also called the universal bootloader, comes into play. It is one of the most popular bootloaders in the Embedded Linux scene (another being RedBoot). It supports a wide range of CPUs and has a well-established user base. Now, let's jump into the code and see how U-Boot performs the tasks listed above for the PXA255 microcontroller, which is also the MCU used in KOBOT.

        /* Step 1 - Enable CP6 permission */
        mrc     p15, 0, r1, c15, c1, 0  @ read CPAR
        orr     r1, r1, #0x40
        mcr     p15, 0, r1, c15, c1, 0  @ write CPAR back
        CPWAIT  r1

        /* Step 2 - Mask ICMR & ICMR2 */
        mov     r1, #0
        mcr     p6, 0, r1, c1, c0, 0    @ ICMR
        mcr     p6, 0, r1, c7, c0, 0    @ ICMR2

        /* turn off all clocks but the ones we will definitely require */
        ldr     r1, =CKEN
        ldr     r2, =( CKEN6_FFUART | CKEN9_OST | CKEN20_IM | CKEN22_MEMC )
        str     r2, [r1]

After going through the reset sequence, the critical initialization is performed. First, access to coprocessor 6 is enabled by setting the corresponding bit in the coprocessor access register (CPAR); the next step reaches the interrupt controller registers through this coprocessor. The ICMR registers are then cleared, which masks all interrupt sources. Finally, all clocks except the ones that will be used are disabled. The clocks left running are those for the FFUART (Full Function UART), the OS timer (CKEN9_OST), internal memory (CKEN20_IM) and the memory controller.
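
The value written to CKEN in the listing above can be reproduced on a host machine. The sketch below assumes that the numeric suffix of each CKEN constant (6, 9, 20, 22) is its bit position in the register; this matches the naming in the listing but is not stated there. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: CKENn_xxx means "bit n of the CKEN register". */
#define CKEN_BIT(n) (UINT32_C(1) << (n))

/* Clocks kept running during early boot, as in the listing above. */
uint32_t cken_boot_mask(void)
{
    return CKEN_BIT(6)     /* CKEN6_FFUART */
         | CKEN_BIT(9)     /* CKEN9_OST    */
         | CKEN_BIT(20)    /* CKEN20_IM    */
         | CKEN_BIT(22);   /* CKEN22_MEMC  */
}

/* Read-modify-write that sets one bit, like "orr r1, r1, #0x40" on CPAR. */
uint32_t set_bit(uint32_t reg, unsigned bit)
{
    assert(bit < 32);
    return reg | CKEN_BIT(bit);
}
```

Setting a bit that is already set leaves the register unchanged, which is why the listing can safely OR the permission bit into whatever CPAR already contains.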

        ldr             r0,     =GPSR0
        ldr             r1,     =CFG_GPSR0_VAL
        str             r1,   [r0]
        ldr             r0,     =GPCR0
        ldr             r1,     =CFG_GPCR0_VAL
        str             r1,   [r0]

        ldr             r0,     =GPDR0
        ldr             r1,     =CFG_GPDR0_VAL
        str             r1,   [r0]

        ldr             r0,     =GAFR0_L
        ldr             r1,     =CFG_GAFR0_L_VAL
        str             r1,   [r0]

        ldr             r0,     =GAFR0_U
        ldr             r1,     =CFG_GAFR0_U_VAL
        str             r1,   [r0]

        /* Only the init sequence for GPIO bank 0 is shown above */

        ldr             r0,     =PSSR
        ldr             r1,     =CFG_PSSR_VAL
        str             r1,     [r0]


        /* Disable the peripheral clocks, and set the core clock frequency */

        /* Write the preferred values for L, N, PPDIS, and CPDIS to CCCR */
        ldr     r1, =CCCR
        ldr     r2, =CCCR_VAL
        str     r2, [r1]

        /* Set CLKCFG[F] and turn on turbo mode */
        mrc     p14, 0, r2, c6, c0, 0
        orr     r2, r2, #0x3    /* Turbo, freq. change */
        mcr     p14, 0, r2, c6, c0, 0

        /* Re-read CCCR then check cp14,6 until it says it's set */
        ldr     r0, [r1]

	/* Write MSC0, MSC1, MSC2 */
	ldr	r3, =MSC0		/* Configures /CS0 and /CS1 */
	ldr	r2, =0x033C35D8		/* From P30-PXA270 design guide */
	str	r2, [r3]
	ldr	r2, [r3]		/* When programming a different memory type in an MSC register, ensure that the new value has 
					 * been accepted and programmed before issuing a command to that memory. To do this, read the 
					 * MSC register before accessing the memory. */

	ldr	r3, =MEMC_BASE

	/* MECR: Memory Expansion Card Register                             */
	ldr     r2, =CFG_MECR_VAL
	str     r2, [r3, #MECR_OFFSET]
	ldr	r2, [r3, #MECR_OFFSET]

	/* MCMEM0: Card Interface slot 0 timing                             */
	ldr     r2, =CFG_MCMEM0_VAL
	str     r2, [r3, #MCMEM0_OFFSET]
	ldr	r2, [r3, #MCMEM0_OFFSET]

	/* MCMEM1: Card Interface slot 1 timing                             */
	ldr     r2, =CFG_MCMEM1_VAL
	str     r2, [r3, #MCMEM1_OFFSET]
	ldr	r2, [r3, #MCMEM1_OFFSET]

	/* MCATT0: Card Interface Attribute Space Timing, slot 0            */
	ldr     r2, =CFG_MCATT0_VAL
	str     r2, [r3, #MCATT0_OFFSET]
	ldr	r2, [r3, #MCATT0_OFFSET]

	/* MCATT1: Card Interface Attribute Space Timing, slot 1            */
	ldr     r2, =CFG_MCATT1_VAL
	str     r2, [r3, #MCATT1_OFFSET]
	ldr	r2, [r3, #MCATT1_OFFSET]

	/* MCIO0: Card Interface I/O Space Timing, slot 0                   */
	ldr     r2, =CFG_MCIO0_VAL
	str     r2, [r3, #MCIO0_OFFSET]
	ldr	r2, [r3, #MCIO0_OFFSET]

	/* MCIO1: Card Interface I/O Space Timing, slot 1                   */
	ldr     r2, =CFG_MCIO1_VAL
	str     r2, [r3, #MCIO1_OFFSET]
	ldr	r2, [r3, #MCIO1_OFFSET]

	/* FLYCNFG (skip on gumstix) */

	/* Reset the system appropriately. Configure, but do not enable, each SDRAM partition pair 
	 * by clearing the enable bits MDCNFG[DEx] when writing to the MDCNFG register. */

	ldr	r0, =MDCNFG_VAL_13_10	/* Load the value for MDCNFG */

	ldr	r3, =MDCNFG		/* Load the SDRAM Configuration register. Must not be enabled yet. */
	str	r0, [r3]		/* Write to MDCNFG register */
	ldr	r0, [r3]

	/* Set MDREFR[K0RUN]. Properly configure MDREFR[K0DB2] and MDREFR[K0DB4]. 
	 * Retain the current values of MDREFR[APD] (clear) and MDREFR[SLFRSH] (set). 
	 * MDREFR[DRI] must contain a valid value (not all 0s). If required, MDREFR[KxFREE] 
	 * can be de-asserted. */

	ldr	r3, =MDREFR
	ldr	r2, [r3]		/* read MDREFR value */

	ldr	r1, =0xfff
	bic	r2, r2, r1
	orr	r2, r2, #0x001		/* configure a valid SDRAM Refresh Interval (DRI) */

	/* SDCLK0 n/c */
	/* SDCLK1 goes to SDRAM */
	bic	r2, r2, #(MDREFR_K2FREE | MDREFR_K1FREE | MDREFR_K0FREE)	/* Clear free run */
	str	r2, [r3]

	/* In systems that contain synchronous flash memory, write to the SXCNFG to configure all 
	 * appropriate bits, including the enables. While the synchronous flash banks are being 
	 * configured, the SDRAM banks must be disabled and MDREFR[APD] must be de-asserted 
	 * (auto-power-down disabled). (skip on gumstix)*/

	/*  In systems that contain SDRAM, toggle the SDRAM controller through the following state 
	 * sequence: self-refresh and clock-stop to self-refresh to power-down to PWRDNX to NOP. */
	orr	r2, r2, #(MDREFR_K1RUN|MDREFR_K2RUN)	/* assert K1RUN (and K2RUN) */
	bic	r2, r2, #(MDREFR_K1DB2|MDREFR_K2DB2)	/* clear K1DB2 (and K2DB2) */
	str	r2, [r3]		/* change from "self-refresh and clock-stop" to "self-refresh" state */

	bic	r2, r2, #MDREFR_SLFRSH	/* clear SLFRSH bit field */
	str	r2, [r3]		/* change from "self-refresh" to "Power-down" state */

	orr	r2, r2, #MDREFR_E1PIN	/* set the E1PIN bit field */
	str	r2, [r3]		/* change from "Power-down" to "PWRDNX" state */

	nop	/* no action is required to change from "PWRDNX" to "NOP" state */

	/* Appropriately configure, but do not enable, each SDRAM partition pair. 
	 * SDRAM partitions are disabled by keeping the MDCNFG[DEx] bits clear.
	 * (note: already done, but manual repeats the instruction, so what do I know?) */
	ldr	r3, =MDCNFG		/* Load the SDRAM Configuration register. Must not be enabled yet. */
	str	r0, [r3]		/* Write to MDCNFG register */
	ldr	r0, [r3]

	/* For systems that contain SDRAM, wait the NOP power-up waiting period required by the 
	 * SDRAMs (normally 100-200 μsec) to ensure the SDRAMs receive a stable clock with a NOP 
	 * condition. */

	ldr	r3, =OSCR		/* reset the OS Timer Count to zero */
	mov	r2, #0
	str	r2, [r3]
	ldr	r4, =0x300		/* really 0x28a is about 200usec, so 0x300 should be plenty */
20:
	ldr	r2, [r3]
	cmp	r4, r2
	bgt	20b

	/* Ensure the XScale core memory-management data cache (Coprocessor 15, Register 1, bit 2) is 
	 * disabled. If this bit is enabled, the refreshes triggered by the next step may not be passed 
	 * properly through to the memory controller. Coprocessor 15, register 1, bit 2 must be re- 
	 * enabled after the refreshes are performed if data cache is preferred. */

	mrc	p15, 0, r0, c1, c0, 0	/* Read the register */
	bic	r0, #0x4		/* turn data cache off */
	mcr	p15, 0, r0, c1, c0, 0

	CPWAIT				/* wait for co-processor */

	/* On hardware reset in systems that contain SDRAM, trigger a number (the number required by 
	 * the SDRAM manufacturer) of refresh cycles by attempting non-burst read or write accesses to
	 * any disabled SDRAM bank. Each such access causes a simultaneous CBR for all four banks, 
	 * which in turn causes a pass through the CBR state and a return to NOP. On the first pass, the 
	 * PALL state is incurred before the CBR state. */

	ldr	r3, =CFG_DRAM_BASE
	mov	r2, #2	/* now must do 2 or more refresh or CBR commands before the first access */
CBR_refresh1:
	str	r2, [r3]
	subs	r2, r2, #1
	bne	CBR_refresh1

	/* Can re-enable DCACHE if it was disabled above (skip on gumstix) */

	/* In systems that contain SDRAM, enable SDRAM partitions by setting MDCNFG[DEx] bits. */

	ldr	r3, =MDCNFG		/* sdram config -- sdram enable */
	ldr	r2, [r3] 
	orr	r2, r2, #(MDCNFG_DE0|MDCNFG_DE1)	/* enable partitions 0,1 */
	orr	r2, r2, #(MDCNFG_DE2|MDCNFG_DE3)	/* enable partitions 2,3 */
	str	r2, [r3]		/* write to MDCNFG */

	/*  In systems that contain SDRAM, write the MDMRS register to trigger an MRS command to 
	 * all enabled banks of SDRAM. For each SDRAM partition pair that has one or both partitions 
	 * enabled, this forces a pass through the MRS state and a return to NOP. The CAS latency is the 
	 * only variable option and is derived from what was programmed into the MDCNFG[MDTC0] 
	 * and MDCNFG[MDTC2] fields. The burst type and length are always programmed to 
	 * sequential and four, respectively. */

	ldr	r3, =MDMRS		/* write the MDMRS */
	ldr	r2, =0x00320032		/* CAS latency = 3 */
	str	r2, [r3]

	/*  In systems that contain SDRAM or synchronous flash, optionally enable auto-power-down by 
	 * setting MDREFR[APD]. */

	ldr	r3, =MDREFR		/* enable auto-power-down */
	ldr	r2, [r3] 
	orr	r2, r2, #MDREFR_APD	/* set the APD bit */
	str	r2, [r3]		/* write to MDREFR */

	/* Now we check to make sure we got the column/row addressing right by checking for mirrors
	 * in low SDRAM.  If we find a mirror, then the location of the mirror will clue us to
	 * what the alignment should actually be.  We start off configuring for 13x10 then scale down */
	mov	r0, #0x00000800
	mov	r1, #CFG_DRAM_BASE
	str	r1, [r1]		/* Write the address to base of RAM */
	str	r0, [r1, r0]		/* Write the offset to the location */
	ldr	r1, [r1]		/* Read back base of RAM */
	cmp	r0, r1			/* See if we found a mirror */
	bne	end_of_memsetup

	/* If we get here, we found a mirror, so restart RAM with different settings and try again */
	ldr	r0, =MDCNFG_VAL_13_9
	b	memory_timing_setup


Then we initialize the GPIO pins, set the core clock frequency, and configure the memory controller. The exception vectors for undefined instruction, software interrupt, prefetch abort, data abort, IRQ, FIQ and reset are also set.
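
The mirror check at the end of the memory setup listing is easier to follow in C. The sketch below simulates it on a host: a small array stands in for SDRAM, and a hypothetical `wrap` parameter models a controller that decodes too few address bits, so that high offsets alias onto low ones. The marker value and all names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_BYTES 4096u          /* size of the simulated address window     */
#define BASE_MARK 0xA0000000u    /* stands in for CFG_DRAM_BASE (assumption) */

static uint32_t sim_ram[SIM_BYTES / 4];

/* The simulated controller only decodes byte offsets modulo `wrap`. */
static uint32_t *cell(uint32_t offset, uint32_t wrap)
{
    assert(wrap != 0 && wrap <= SIM_BYTES && offset % 4 == 0);
    return &sim_ram[(offset % wrap) / 4];
}

/* Returns 1 if a write at `probe` shows up at offset 0, i.e. RAM is mirrored. */
int mirror_found(uint32_t probe, uint32_t wrap)
{
    *cell(0, wrap) = BASE_MARK;        /* str r1, [r1]     */
    *cell(probe, wrap) = probe;        /* str r0, [r1, r0] */
    return *cell(0, wrap) == probe;    /* ldr r1, [r1]; cmp r0, r1 */
}
```

When the probe offset equals the wrap size, the second store overwrites the marker at the base and the check reports a mirror; with a larger wrap size the marker survives. In the listing, a detected mirror leads to a retry with the MDCNFG_VAL_13_9 configuration.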

        ldr     r0, _MALLOC_DRAM_BASE   /* malloc area top in SDRAM         */
        sub     r0, r0, #CFG_MALLOC_LEN /* malloc area                      */
        sub     r0, r0, #CFG_GBL_DATA_SIZE /* bdinfo                        */
        sub     sp, r0, #12             /* leave 3 words for abort-stack    */

        ldr     r0, _bss_start          /* find start of bss segment        */
        ldr     r1, _bss_end            /* stop here                        */
        mov     r2, #0x00000000         /* clear                            */

clbss_l:str     r2, [r0]                /* clear loop...                    */
        add     r0, r0, #4
        cmp     r0, r1
        ble     clbss_l

        ldr     pc, _start_armboot

Because the memory controller has been set up, we now have usable RAM. At this point U-Boot sets up a stack in RAM and copies itself there.
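
The stack and BSS setup in the listing above boils down to a subtraction and a loop. The following host-side sketch mirrors both under stated assumptions: the function names are hypothetical, and the sizes passed in are whatever the board configuration defines for CFG_MALLOC_LEN and CFG_GBL_DATA_SIZE.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Mirrors the three "sub" instructions: the stack starts below the malloc
 * area and the global data, minus 3 words reserved for the abort stack.
 */
uint32_t initial_sp(uint32_t malloc_top, uint32_t malloc_len,
                    uint32_t gbl_data_size)
{
    return malloc_top - malloc_len - gbl_data_size - 12u;
}

/*
 * Mirrors clbss_l: store zero, advance one word, loop while p <= end.
 * Like the "ble" in the listing, this also clears the word at `end`.
 */
void clear_words_inclusive(uint32_t *start, uint32_t *end)
{
    uint32_t *p = start;

    assert(start <= end);
    do {
        *p++ = 0;
    } while (p <= end);
}
```

Note the inclusive bound: the loop in the listing continues while the pointer is less than or equal to _bss_end, so the word at the end address is cleared as well.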

This also means that we have now established a C runtime environment, so we can leave the assembler behind and continue in C. The next function called is start_armboot, a generic function in the U-Boot framework. It calls your board-specific functions through a table of init_fnc_t function pointers. These board-specific functions are:

  • cpu_init (cpu/pxa/cpu.c): initializes the IRQ and stacks.
  • board_init (board/gumstix/gumstix.c): sets the board serial and the address where the board expects boot parameters.
  • env_init: reads the U-Boot configuration from the flash.
  • init_baudrate (lib_arm/board.c): initializes the baud rate to the setting given in the U-Boot configuration.
  • serial_init (cpu/pxa/serial.c): initializes the serial port so that we can establish a healthy serial connection.

Afterwards, the console is initialized and the U-Boot banner is displayed.
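
The table-driven initialization described above can be sketched as follows. This is a simplified illustration, not U-Boot's actual code: the stub functions and table names are hypothetical stand-ins for the board-specific functions, and a real boot loader would typically stop booting when a step fails rather than return an error.

```c
#include <stddef.h>

/* Each init step takes no arguments and returns 0 on success. */
typedef int (init_fnc_t)(void);

int init_calls;   /* counts how many init steps have run */

/* Hypothetical stand-ins for cpu_init, board_init, serial_init, ... */
int cpu_init_stub(void)     { init_calls++; return 0; }
int board_init_stub(void)   { init_calls++; return 0; }
int serial_init_stub(void)  { init_calls++; return 0; }
int failing_init_stub(void) { init_calls++; return -1; }

/* NULL-terminated tables of init steps, run in order. */
init_fnc_t *demo_sequence[] = {
    cpu_init_stub, board_init_stub, serial_init_stub, NULL
};
init_fnc_t *broken_sequence[] = {
    cpu_init_stub, failing_init_stub, serial_init_stub, NULL
};

/* Runs every step until the terminator; stops at the first failure. */
int run_init_sequence(init_fnc_t **seq)
{
    for (; *seq != NULL; ++seq)
        if ((*seq)() != 0)
            return -1;
    return 0;
}
```

The benefit of this design is that porting to a new board mostly means supplying different functions for the table, while the generic startup code stays untouched.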

The Linux Kernel

The Linux kernel is a monolithic kernel consisting of approximately 20000 files, including the kernel subsystems (memory manager, scheduler, filesystems, etc.) and drivers for a vast number of devices. Linux is among the leaders in the number of CPU architectures supported. To see the supported architectures, consult the arch directory in the Linux kernel source tree.

The source code is hosted online. At first, it might seem like an impossible task to get a grasp of the source tree. Using the right tools, reading a few books (see App. A: References), and spending some time reading the source code can help a developer get on track. The source tree consists of:

  • arch: architecture specific code
  • Documentation
  • ipc: kernel IPC structures (semaphores, shared memory, etc.)
  • drivers: driver and subsystem (usb, i2c, ...) modules
  • fs: filesystem code
  • mm: memory management code
  • scripts: common scripts used for the build
  • include: header files
  • lib: common data structures, crc related code
  • crypto
  • init: initialization code
  • net: networking protocols
  • sound: sound subsystems
  • security

The Kernel Proper

Before going further into how a kernel is built, it is worth examining the output of the build. When you successfully compile a Linux kernel, you get a file named vmlinux as a result. That file is a stand-alone, statically linked ELF file which contains all the kernel code (and the drivers, if you have chosen to build them statically instead of as modules). Instead of using the kernel build system, you could compile each of the kernel source files by hand and then link them with ld to obtain an equivalent of the vmlinux binary.

Now, to cross-compile a kernel, configure your kernel by issuing:

make ARCH=arm [menuconfig|config|xconfig]

Then build it with:

make ARCH=arm CROSS_COMPILE=arm-linux-

Root File System

The root file system is the '/' directory that the kernel mounts at system startup. It contains all the software that runs after the bootloader: the binaries of the GNU utilities, the dev directory for the device nodes, the etc directory for the configuration files, the dynamic libraries, and the Linux kernel.

Buildroot: Cross-development Toolchain and Root File System Building Made Easy

Buildroot is a set of Makefiles and patches that eases the building of a rootfs and a cross-development toolchain. It lets you build the cross-toolchain, including binutils, gcc and a C library. You can also build a root file system that includes busybox customized to your own requirements.


Busybox is a tool that contains stripped-down versions of the most commonly used Unix tools and is widely used on Linux-based embedded systems. It is very easy to configure and compile, and has a footprint of approximately 450k when compiled with the uClibc library and the common tools. It includes:

  • Standard UNIX tools: ls, cp, cat, rm, chmod , ...
  • Networking tools: traceroute, nslookup, ifconfig, httpd, ...
  • Search tools: find, grep, ...
  • Archival tools: rpm, dpkg, bunzip2, ...
  • ... editors, file-system tools and more.

The package can be obtained from the project's website.

The configuration for the busybox is similar to the Linux kernel configuration as they both use an ncurses based configuration screen. To configure busybox you simply type:

make menuconfig
make

You can select the compiler to use under Busybox settings -> Build Options. After completing the configuration, you simply issue the 'make' command from the shell, and there it is! A binary named busybox is created. This single binary contains all the tools selected in the configuration screen. If you've chosen an installation prefix, you will see soft links named after the utilities, all of which point to the busybox binary.
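
How a single binary can act as many tools through soft links can be sketched as follows: the program looks at the name it was invoked under and dispatches to the matching tool. This is a minimal illustration of the idea, not Busybox's actual source; the applet table, the function names and the return codes are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef int (*applet_fn)(void);

struct applet {
    const char *name;   /* name of the soft link, e.g. "ls" */
    applet_fn run;      /* tool implementation              */
};

/* Hypothetical applets; distinct return codes make the dispatch visible. */
static int applet_ls(void)  { return 1; }
static int applet_cat(void) { return 2; }

static const struct applet applets[] = {
    { "ls",  applet_ls  },
    { "cat", applet_cat },
};

/* Strips any directory part from argv[0], e.g. "/bin/ls" becomes "ls". */
const char *invoked_name(const char *argv0)
{
    const char *slash = strrchr(argv0, '/');
    return slash ? slash + 1 : argv0;
}

/* Runs the applet matching the invoked name; returns -1 if none matches. */
int dispatch(const char *argv0)
{
    const char *name = invoked_name(argv0);
    size_t i;

    for (i = 0; i < sizeof applets / sizeof applets[0]; i++)
        if (strcmp(applets[i].name, name) == 0)
            return applets[i].run();
    return -1;
}
```

A real main function would simply call dispatch(argv[0]), so that a soft link named ls pointing at the binary ends up running the ls applet.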

Dissecting KOBOT's Eyes


The motherboard used in KOBOT is a Gumstix Waysmall board with:

  • an Intel PXA255 processor clocked at 200 MHz
  • 64 MB of RAM
  • 4 MB of Flash
  • 2 serial ports and a USB client with a mini-B socket

Connected to that motherboard is a daughterboard designed by Mehmet Durna, which includes an OmniVision 6620 video sensor and an Averlogic frame buffer. Both of the peripherals are connected to the PXA255 through GPIO pins.

Inside the Flash

The default rootfs in the gumstix board comes with:

  • Busybox
  • module-init-tools
  • uClibc
  • Boa webserver
  • Dropbear ssh server

Camera Driver

The camera driver for KOBOT consists of two modules. One module accesses the configuration registers of the camera through the SCCB (Serial Camera Control Bus), a bus equivalent to the i2c bus invented by Philips. The other module implements the camera initialization function, the interrupt handler and the read function that passes image data from the kernel to userspace.


i2c is a serial bus used for attaching peripherals to a microcontroller. It is 8-bit oriented and supports transfer speeds of up to 3.4 Mbit/s in its High-speed mode. To write an i2c driver for Linux, you need two structs that link your device to the i2c subsystem. The first is struct i2c_driver. You fill this struct with function pointers that define how the driver behaves; functions for attaching a device, detaching a device, shutdown, suspend, resume and command can be bound to it. The second, struct i2c_client, holds device-specific data. So a struct i2c_driver contains the generic hooks shared by every device of a given type, and a struct i2c_client holds the data for one particular device.
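
The split between shared driver hooks and per-device data can be illustrated outside the kernel. The structs below are simplified stand-ins and are not the kernel's struct i2c_driver and struct i2c_client; every name, the register array and the bus addresses used with them are hypothetical.

```c
#include <stdint.h>
#include <string.h>

struct sketch_client;

/* Shared hooks: one instance serves every device of this type. */
struct sketch_driver {
    const char *name;
    int (*attach)(struct sketch_client *client);
    int (*command)(struct sketch_client *client, uint8_t reg, uint8_t value);
};

/* Per-device data: one instance per device on the bus. */
struct sketch_client {
    uint8_t addr;                        /* bus address of this device */
    const struct sketch_driver *driver;  /* hooks used by this device  */
    uint8_t regs[256];                   /* simulated register file    */
    int attached;
};

static int cam_attach(struct sketch_client *c)
{
    memset(c->regs, 0, sizeof c->regs);
    c->attached = 1;
    return 0;
}

static int cam_command(struct sketch_client *c, uint8_t reg, uint8_t value)
{
    if (!c->attached)
        return -1;
    c->regs[reg] = value;   /* a real driver would issue a bus write here */
    return 0;
}

const struct sketch_driver cam_driver = { "sketch-cam", cam_attach, cam_command };

/* Binds a client to the driver and runs the attach hook. */
int bind_client(struct sketch_client *c, uint8_t addr)
{
    c->addr = addr;
    c->driver = &cam_driver;
    c->attached = 0;
    return c->driver->attach(c);
}
```

Two clients bound to the same driver share its function pointers but keep separate register state, which is the division of labor described above.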

Video Sensor

The OmniVision video sensor has the following properties:

  • 352x288 resolution
  • 60 fps
  • YCrCb 4:2:2, GRB 4:2:2, RGB Raw Data Output
  • SCCB programmable

Figure: The driver flowchart

When the video sensor module is loaded, it first initializes the frame buffer that resides on the daughterboard. Then it sets up the IRQ handler for the video sensor's FODD signal. The IRQ handler asserts the write enable signal of the frame buffer when FODD goes from low to high, so that the camera can write to it, and deasserts it when FODD goes from high to low. When a read request is made from userspace, the function that implements the read system call reads the image data into a buffer that resides in the kernel and passes it to userspace.
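
The behavior described above can be modeled in plain C to make the logic concrete. This is a host-side sketch under stated assumptions, not the KOBOT driver: the struct, the function names and the 16-byte buffer are hypothetical, and memcpy stands in for the kernel-to-userspace copy that a real read function would perform with copy_to_user.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct capture_state {
    bool write_enable;        /* frame buffer write enable signal  */
    unsigned frames_seen;     /* rising FODD edges observed so far */
    unsigned char kbuf[16];   /* stands in for the kernel buffer   */
};

/* Called on every FODD transition with the new signal level. */
void fodd_edge(struct capture_state *s, bool fodd_high)
{
    if (fodd_high) {
        s->write_enable = true;    /* low to high: let the camera write */
        s->frames_seen++;
    } else {
        s->write_enable = false;   /* high to low: stop writing */
    }
}

/* Copies at most `count` bytes of image data; returns the bytes copied. */
size_t read_frame(const struct capture_state *s, unsigned char *ubuf,
                  size_t count)
{
    size_t n = count < sizeof s->kbuf ? count : sizeof s->kbuf;

    memcpy(ubuf, s->kbuf, n);
    return n;
}
```

Clamping the copy length to the size of the kernel buffer keeps a large read request from running past the end of the image data.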

Figure: Video port timing

This page was last modified on 28 December 2009, at 15:26.