Quasi-systolic processor and quasi-systolic array

Inventors

Hoskins, Brian Douglas; Stiles, Mark David; Daniels, Matthew William; Madhavan, Advait; Adam, Gina Cristina

Assignees

United States Department of Commerce

Publication Number

US-11651231-B2

Publication Date

2023-05-16

Expiration Date

2040-03-02


Abstract

A quasi-systolic array includes: a primary quasi-systolic processor; an edge row bank and edge column bank of edge quasi-systolic processors; and an interior bank of interior quasi-systolic processors. The primary quasi-systolic processor, edge quasi-systolic processor, and interior quasi-systolic processor independently include a quasi-systolic processor and are disposed and electrically connected in rows and columns in the quasi-systolic array.

Core Innovation

The invention disclosed is a quasi-systolic processor and quasi-systolic array designed to perform low rank matrix decompositions efficiently for hardware-accelerated machine learning systems. The quasi-systolic processor includes multiple forward and backward input and output transmission lines, primary processors that linearly transform data through rotations based on stored phase angles, and identity processors that handle the unpaired line when the number of transmission lines is odd. These components are arranged in rows and columns and communicate electrically, magnetically, mechanically, or photonically to perform transformations in a quasi-systolic manner.

The quasi-systolic array comprises a primary quasi-systolic processor, an edge row bank and an edge column bank of edge quasi-systolic processors, and an interior bank of interior quasi-systolic processors. The array is structured so that the primary quasi-systolic processor receives and initially processes the forward data, which is then propagated and dimensionally reduced through the edge banks and processed in the interior bank. Backward propagation of data follows the same path in reverse and concludes at the primary processor, which allows streaming eigen-updates to be performed efficiently.

The problem being solved relates to the inefficiencies of conventional systolic processors and methods when training hardware neuromorphic networks, specifically the high memory overhead and computational cost associated with full rank update matrix calculations. Conventional architectures require large update matrices with dimensions equal to those of the main parameter arrays, leading to increased area, time, and energy consumption during training. The quasi-systolic array addresses these inefficiencies by performing low rank matrix approximations and streaming principal component analysis to significantly reduce memory and computational requirements, enabling faster, more accurate, and more energy-efficient training of neuromorphic systems.
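The memory argument above can be made concrete with a short count of stored values. The following is a minimal sketch, not taken from the patent: the function names and the 1024 x 1024 array size are hypothetical, and the low rank count assumes the standard factored form in which each retained component stores one left vector, one right vector, and one scalar weight.

```python
def full_rank_update_entries(rows, cols):
    """Entries in an update matrix with the same shape as the parameter array."""
    return rows * cols


def low_rank_update_entries(rows, cols, rank):
    """Entries in a rank-`rank` factored update (assumed storage layout):
    one left vector (rows), one right vector (cols), and one scalar
    weight per retained component."""
    return rank * (rows + cols + 1)


# Hypothetical example: a 1024 x 1024 parameter array with a rank-4 update.
full = full_rank_update_entries(1024, 1024)
low = low_rank_update_entries(1024, 1024, 4)
print(full, low)  # 1048576 8196
```

Under these assumptions the factored update needs roughly 1/128 of the storage of the full update matrix; the actual saving in hardware depends on the rank chosen and on how the array stores its intermediate values.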

Claims Coverage

The patent claims include three independent claims covering the quasi-systolic processor, the quasi-systolic array, and a process for performing streaming eigen-updates with the array. The inventive features comprehensively describe structural and functional components, their interconnections, and a method leveraging these elements for efficient neuromorphic network training.

Quasi-systolic processor with paired transmission lines and rotational linear transforms

The processor has forward and backward input and output transmission lines, each set numbering s, and a set of f primary processors, where f is the floor of s/2. Each primary processor connects to a pair of transmission lines for input and a pair for output in both the forward and backward directions. Each primary processor performs linear transformations by rotations based on a stored phase angle, using a forward linear transform processor and a backward linear transform processor. The processor also includes a phase angle memory and a phase angle accumulation memory, which update the phase angles in response to counter signals. For odd s, an identity processor forwards the unpaired line without transformation.
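The paired-line rotation described above can be sketched in a few lines of Python. This is an illustration only, not the claimed circuit: pairing adjacent lines (0 and 1, 2 and 3, and so on), the sign convention of the rotation, and modelling the backward transform as the inverse rotation are all assumptions, and the function names are hypothetical. Phase angle accumulation and counter signals are not modelled.

```python
import math


def num_primary_processors(s):
    """f = floor(s / 2) primary processors for s transmission lines."""
    return s // 2


def forward_transform(x, angles):
    """Rotate each adjacent pair of lines by its stored phase angle.

    x      -- values on the s forward input lines
    angles -- one phase angle (radians) per primary processor
    With odd s, the last line passes through unchanged, which plays
    the role of the identity processor.
    """
    f = num_primary_processors(len(x))
    if len(angles) != f:
        raise ValueError("expected one phase angle per primary processor")
    y = list(x)
    for i, theta in enumerate(angles):
        a, b = x[2 * i], x[2 * i + 1]
        c, s_ = math.cos(theta), math.sin(theta)
        y[2 * i] = c * a - s_ * b
        y[2 * i + 1] = s_ * a + c * b
    return y


def backward_transform(y, angles):
    """Assumed backward path: apply the inverse (transposed) rotation."""
    return forward_transform(y, [-theta for theta in angles])


# Three lines: one primary processor plus one identity pass-through.
out = forward_transform([1.0, 0.0, 5.0], [math.pi / 2])
print(out)  # approximately [0.0, 1.0, 5.0]
```

Because each rotation is orthogonal, applying `backward_transform` with the same angles recovers the original inputs up to floating-point error.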

Quasi-systolic array arrangement with distinct processor banks and data propagation

The array includes a primary quasi-systolic processor, edge row and column banks each containing multiple edge quasi-systolic processors, and an interior bank comprising interior quasi-systolic processors. Each of these processors independently comprises the quasi-systolic processor as described. They are arranged in rows and columns so that the primary processor and edge row bank occupy the first row, and the primary processor and edge column bank occupy the first column. Data flows such that the primary processor receives the forward datum first and produces its output before any other processor in the array. At least half of the primary processor's forward outputs connect respectively to single edge processors in the edge row and column banks. Within each edge bank the processors are connected in series, with the number of forward input transmission lines halving from one processor to the next. The interior bank processes the forward data it receives and then propagates the backward datum back through itself and the edge banks to the primary processor.
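The sequential halving of forward input lines along an edge bank can be illustrated with a short helper. This is a sketch under stated assumptions: the function name is hypothetical, the line count of the first edge processor is supplied by the caller rather than derived from the patent, and integer halving is used for simplicity.

```python
def edge_bank_line_counts(first_inputs, num_processors):
    """Forward input line count for each edge processor in series.

    first_inputs   -- lines entering the first edge processor (assumed given)
    num_processors -- number of edge processors in the bank
    Each processor is assumed to receive half as many lines as the one
    before it.
    """
    counts = []
    lines = first_inputs
    for _ in range(num_processors):
        if lines < 1:
            raise ValueError("bank is longer than the halving sequence allows")
        counts.append(lines)
        lines //= 2
    return counts


print(edge_bank_line_counts(16, 4))  # [16, 8, 4, 2]
```

With halving at every stage, reducing n lines to a single line takes on the order of log2(n) processors in series.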

Process for streaming eigen-updates in hardware neuromorphic networks using the quasi-systolic array

The process involves the primary quasi-systolic processor receiving two forward data inputs and producing two forward outputs. The edge row and column banks receive these outputs and iteratively transform and reduce their dimensionality to produce third and fourth forward data for the interior bank processors. The interior processors produce first and second backward data which propagate backward through the interior and edge banks. The edge banks further transform the backward data to produce fifth and sixth backward data for the primary processor, which transforms these to produce final backward data enabling streaming eigen-updates in the neuromorphic network.

The claims comprehensively define a system comprising structurally interconnected processors with paired transmission lines and phase-controlled linear transformations, assembled into a quasi-systolic array with hierarchical banks and staged data propagation. Further, the claims encompass a defined process exploiting this architecture to perform streaming eigen-updates for efficient neuromorphic network training.

Stated Advantages

The quasi-systolic array provides greater efficiency in training hardware neuromorphic networks in terms of area, time, and energy compared to conventional systolic processors.

It reduces the memory overhead by calculating low rank approximations of update matrices instead of full rank updates, leading to fewer memory locations needed.

The architecture achieves faster and more accurate computation with low latency, and its two-dimensional, binary-tree-structured arrangement provides exponential acceleration of dimensionality reduction.

Streaming batch eigen-updates enable compact batch update representation requiring less memory and computational cost while providing high training fidelity.

The quasi-systolic array allows scaling to larger sizes more efficiently than conventional approaches, alleviating technical deficiencies such as slow training speed and high computational costs.

Documented Applications

Compression of batch training data in artificial neural networks.

Compression of artificial neural network training data for transmission during model synchronization in data centers or federated learning across wireless networks.

Subspace tracking of incoming radar signals from phased arrays.

Extraction of principal components for time series data.
