

Project Organisation


NanoStreams explores an application-specific heterogeneous analytics-on-chip (AoC) engine to accommodate transactional and analytical processing on high-frequency data streams. The following figure outlines the NanoStreams AoC.

The AoC includes a small number of latency-optimised RISC cores and a large number of throughput-optimised, application-specific nano-cores.

NanoStreams will prototype this architecture on the Xilinx Zynq platform and will deploy the platform as a real-time analytics accelerator for many-core micro-servers. Current Zynq platforms offer two ARM Cortex-A9 cores, with about 70% of the chip area devoted to reconfigurable accelerator fabric. We estimate that with 20 nm technology, which is expected to become available during the project, NanoStreams will be able to integrate up to 1000 nano-cores in reconfigurable logic.


In the NanoStreams software stack, the programmer expresses analytical queries using a domain-specific database language. In the NanoStreams test bed, analytical queries are expressed in SQL and Q, the domain-specific languages employed by MonetDB and KDB+ respectively.

The query execution engines of the in-memory databases are implemented using a parallel extension of C with streaming dataflow extensions and native interfaces for data access provided by the databases.



The streaming components of workloads are implemented directly in C with dataflow extensions and real-time extensions, using the same language substrate as the analytical queries. NanoStreams introduces real-time extensions in the language runtime system, in conjunction with performance isolation of the operating and runtime system components from the transactional and analytical kernels, to achieve high analytical throughput and transactional packet processing response times that match the streaming data rates on the ingestion path.


The NanoStreams operating system manages reconfigurable accelerators as first-class resources, implementing time accounting and strict priority scheduling of tasks on accelerators, just as it does for tasks on the ARM cores. The system software stack is isolated from the transactional and analytical kernels by executing on dedicated RISC cores and using dedicated cache banks to minimise interference and contention. Each AoC runs a copy of the operating system, and different AoC nodes communicate explicitly with RDMA, further enhancing isolation and scalability of the entire system software stack, while enabling parallelisation of analytics applications that mine and correlate data from multiple streams.
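
As an illustration only, the combination of strict priority scheduling and time accounting described above can be sketched in a few lines of Python. This is not the NanoStreams operating system code: the class `AcceleratorScheduler`, the task names, the convention that a lower number means a higher priority, and the microsecond costs are all assumptions made for the example.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class Task:
    priority: int                       # assumption: lower value = higher priority
    seq: int                            # tie-breaker that preserves submission order
    name: str = field(compare=False)
    owner: str = field(compare=False)   # who is charged for the accelerator time
    cost_us: int = field(compare=False) # hypothetical run time in microseconds


class AcceleratorScheduler:
    """Toy model of one accelerator treated as a schedulable resource."""

    def __init__(self):
        self._queue = []
        self._seq = count()
        self.usage_us = {}              # time accounting: owner -> microseconds used

    def submit(self, name, owner, priority, cost_us):
        task = Task(priority, next(self._seq), name, owner, cost_us)
        heapq.heappush(self._queue, task)

    def run_all(self):
        """Run queued tasks in strict priority order and charge their owners."""
        order = []
        while self._queue:
            task = heapq.heappop(self._queue)
            self.usage_us[task.owner] = self.usage_us.get(task.owner, 0) + task.cost_us
            order.append(task.name)
        return order


# Hypothetical workload: one ingestion task and two analytical tasks.
sched = AcceleratorScheduler()
sched.submit("risk_query", "analytics", priority=2, cost_us=400)
sched.submit("packet_parse", "ingest", priority=0, cost_us=50)
sched.submit("vwap_update", "analytics", priority=1, cost_us=120)
order = sched.run_all()
```

In this sketch the ingestion task always runs first because it has the highest priority, regardless of submission order, and `usage_us` records how much accelerator time each owner has consumed. A real implementation would also have to handle preemption and reconfiguration cost, which the text above does not specify.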


The operating system provides a single, byte-addressable, asymmetric virtual address space to each AoC node. This address space is distributed between DRAM and PCRAM. The runtime system directly accesses the asymmetric address space via a data placement interface, regular load-store instructions and RDMA. RDMA leverages the AoC network interface to transfer data between different types of memory or different AoC address spaces efficiently, without involving the AoC cores.
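
To make the idea of a data placement interface over a single asymmetric address space more concrete, here is a minimal Python sketch. It is purely illustrative: the names `AsymmetricAddressSpace`, `place` and `kind_of`, the layout that puts DRAM first and PCRAM directly after it, and the simple bump allocation are assumptions for the example, not the actual NanoStreams interface.

```python
from enum import Enum


class MemoryKind(Enum):
    DRAM = "dram"
    PCRAM = "pcram"


class AsymmetricAddressSpace:
    """Toy model: one flat address range backed by two kinds of memory."""

    def __init__(self, dram_bytes, pcram_bytes):
        # Assumed layout: DRAM covers [0, dram_bytes), PCRAM follows directly after.
        self._limit = {
            MemoryKind.DRAM: dram_bytes,
            MemoryKind.PCRAM: dram_bytes + pcram_bytes,
        }
        # Next free address in each region (bump allocation, nothing is freed).
        self._next = {MemoryKind.DRAM: 0, MemoryKind.PCRAM: dram_bytes}

    def place(self, size, kind):
        """Reserve `size` bytes in the requested memory kind; return the start address."""
        start = self._next[kind]
        if start + size > self._limit[kind]:
            raise MemoryError(f"not enough {kind.value} left for {size} bytes")
        self._next[kind] = start + size
        return start

    def kind_of(self, address):
        """Report which memory kind backs a given address."""
        if 0 <= address < self._limit[MemoryKind.DRAM]:
            return MemoryKind.DRAM
        if self._limit[MemoryKind.DRAM] <= address < self._limit[MemoryKind.PCRAM]:
            return MemoryKind.PCRAM
        raise ValueError(f"address {address} is outside the address space")


# Hypothetical sizes: 1 KiB of DRAM and 4 KiB of PCRAM.
space = AsymmetricAddressSpace(dram_bytes=1024, pcram_bytes=4096)
hot = space.place(256, MemoryKind.DRAM)     # latency-sensitive data
cold = space.place(2048, MemoryKind.PCRAM)  # large, less frequently written data
```

The point of the sketch is that both allocations return ordinary addresses in the same range, so the caller can use them uniformly, while the placement call decides which type of memory backs each one.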


NanoStreams addresses the needs of a range of real-time analytics applications on streaming “Big Data”, with business cases in computational finance, business intelligence and healthcare.