Seq2SeqSharp

Developer(s)	Zhongkai Fu
Repository	https://github.com/zhongkaifu/Seq2SeqSharp
Type	Library for deep learning
License	BSD 3
Website	github.com/zhongkaifu/Seq2SeqSharp


Seq2SeqSharp^[2] is a fast and flexible tensor-based encoder-decoder deep neural network framework written in C# for .NET. Its features include automatic differentiation, several types of encoders and decoders (Transformer, LSTM, BiLSTM and others), and multi-GPU support.

History

The project is hosted on GitHub.^[3]

Features

Pure C# framework
Deep bi-directional LSTM encoder
Deep attention based LSTM decoder
Transformer encoder
Graph based neural network
Automatic differentiation
Tensor based operations
Running on both CPU and GPU (CUDA)
Multi-GPU support
Mini-batch
Dropout
RMSProp optimization
Embedding & Pre-trained model
Auto data shuffling
Auto vocabulary building
Beam search decoder
Neural network visualization

How it works

Thanks to automatic differentiation, a tensor-based compute graph and built-in operations, a neural network can be built with a few lines of code. The framework automatically builds the corresponding backward pass and lets the network run on multiple GPUs or on CPUs. The following example implements an attention-based LSTM cell in C#.

       /// <summary>
       /// Update LSTM-Attention cells according to given weights
       /// </summary>
       /// <param name="context">The context weights for attention</param>
       /// <param name="input">The input weights</param>
       /// <param name="g">The compute graph used to build the workflow</param>
       /// <returns>The updated hidden weights</returns>
       public IWeightTensor Step(IWeightTensor context, IWeightTensor input, IComputeGraph g)
       {
           var computeGraph = g.CreateSubGraph(m_name);

           var cell_prev = Cell;
           var hidden_prev = Hidden;

           var hxhc = computeGraph.ConcatColumns(input, hidden_prev, context);
           var hhSum = computeGraph.Affine(hxhc, m_Wxhc, m_b);
           var hhSum2 = layerNorm1.Process(hhSum, computeGraph);

           (var gates_raw, var cell_write_raw) = computeGraph.SplitColumns(hhSum2, m_hdim * 3, m_hdim);
           var gates = computeGraph.Sigmoid(gates_raw);
           var cell_write = computeGraph.Tanh(cell_write_raw);

           (var input_gate, var forget_gate, var output_gate) = computeGraph.SplitColumns(gates, m_hdim, m_hdim, m_hdim);

           // compute new cell activation: ct = forget_gate * cell_prev + input_gate * cell_write
           Cell = computeGraph.EltMulMulAdd(forget_gate, cell_prev, input_gate, cell_write);
           var ct2 = layerNorm2.Process(Cell, computeGraph);

           Hidden = computeGraph.EltMul(output_gate, computeGraph.Tanh(ct2));

           return Hidden;
       }
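
The idea that the framework derives the backward pass from the forward code can be illustrated with a minimal reverse-mode sketch. It is not Seq2SeqSharp's implementation: the `Node` and `TinyGraph` classes and their `Mul`, `Add` and `Backward` members are hypothetical names chosen for this illustration, and the sketch works on scalars rather than tensors. Each forward operation records a closure that propagates gradients, and `Backward` replays those closures in reverse order.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical scalar node: a value plus its accumulated gradient.
public sealed class Node
{
    public double Value;
    public double Grad;
    public Node(double value) { Value = value; }
}

// Hypothetical miniature compute graph (scalars only, for illustration).
public sealed class TinyGraph
{
    // Backward steps, recorded in the order the forward operations run.
    private readonly List<Action> m_backprop = new List<Action>();

    public Node Mul(Node a, Node b)
    {
        var res = new Node(a.Value * b.Value);
        // d(a*b)/da = b and d(a*b)/db = a
        m_backprop.Add(() =>
        {
            a.Grad += b.Value * res.Grad;
            b.Grad += a.Value * res.Grad;
        });
        return res;
    }

    public Node Add(Node a, Node b)
    {
        var res = new Node(a.Value + b.Value);
        // Addition passes the gradient through unchanged to both inputs.
        m_backprop.Add(() =>
        {
            a.Grad += res.Grad;
            b.Grad += res.Grad;
        });
        return res;
    }

    // Seed the output gradient with 1 and replay the recorded steps in reverse.
    public void Backward(Node output)
    {
        output.Grad = 1.0;
        for (int i = m_backprop.Count - 1; i >= 0; i--)
        {
            m_backprop[i]();
        }
    }
}

public static class AutoDiffDemo
{
    public static void Main()
    {
        var g = new TinyGraph();
        var x = new Node(3.0);
        var w = new Node(2.0);
        var b = new Node(1.0);

        var y = g.Add(g.Mul(w, x), b);   // y = w * x + b
        g.Backward(y);

        Console.WriteLine($"y = {y.Value}, dy/dw = {w.Grad}, dy/dx = {x.Grad}, dy/db = {b.Grad}");
    }
}
```

With w = 2, x = 3 and b = 1, the forward value is 7 and the gradients are dy/dw = 3, dy/dx = 2 and dy/db = 1. The caller writes only the forward expression, which is the same division of labour the `Step` method above relies on.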

The next example implements the scaled multi-head attention component, the core part of the Transformer model, in C#.

       /// <summary>
       /// Scaled multi-head attention component with skip-connected feed-forward layers
       /// </summary>
       /// <param name="input">The input tensor</param>
       /// <param name="graph">The instance of the compute graph</param>
       /// <returns>The layer-normalized output of the feed-forward block</returns>
       public IWeightTensor Perform(IWeightTensor input, IComputeGraph graph)
       {
           IComputeGraph g = graph.CreateSubGraph(m_name);

           var seqLen = input.Rows / m_batchSize;

           //Input projections
           var allQ = g.View(Q.Process(input, g), m_batchSize, seqLen, m_multiHeadNum, m_d);
           var allK = g.View(K.Process(input, g), m_batchSize, seqLen, m_multiHeadNum, m_d);
           var allV = g.View(V.Process(input, g), m_batchSize, seqLen, m_multiHeadNum, m_d);

           //Multi-head attentions
           var Qs = g.View(g.Permute(allQ, 2, 0, 1, 3), m_multiHeadNum * m_batchSize, seqLen, m_d);
           var Ks = g.View(g.Permute(allK, 2, 0, 3, 1), m_multiHeadNum * m_batchSize, m_d, seqLen);
           var Vs = g.View(g.Permute(allV, 2, 0, 1, 3), m_multiHeadNum * m_batchSize, seqLen, m_d);

           // Scaled softmax
           float scale = 1.0f / (float)Math.Sqrt(m_d);
           var attn = g.MulBatch(Qs, Ks, m_multiHeadNum * m_batchSize, scale);
           var attn2 = g.View(attn, m_multiHeadNum * m_batchSize * seqLen, seqLen);

           var softmax = g.Softmax(attn2);
           var softmax2 = g.View(softmax, m_multiHeadNum * m_batchSize, seqLen, seqLen);
           var o = g.View(g.MulBatch(softmax2, Vs, m_multiHeadNum * m_batchSize), m_multiHeadNum, m_batchSize, seqLen, m_d);
           var W = g.View(g.Permute(o, 1, 2, 0, 3), m_batchSize * seqLen, m_multiHeadNum * m_d);

           // Output projection
           var finalAttResults = g.Affine(W, W0, b0);

           //Skip connection and layer normalization
           var addedAttResult = g.Add(finalAttResults, input);
           var normAddedAttResult = layerNorm1.Process(addedAttResult, g);

           //Feed forward
           var ffnResult = feedForwardLayer1.Process(normAddedAttResult, g);
           var reluFFNResult = g.Relu(ffnResult);
           var ffn2Result = feedForwardLayer2.Process(reluFFNResult, g);

           //Skip connection and layer normalization
           var addFFNResult = g.Add(ffn2Result, normAddedAttResult);
           var normAddFFNResult = layerNorm2.Process(addFFNResult, g);

           return normAddFFNResult;
       }
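
The core computation inside `Perform`, a scaled dot product followed by a softmax and a weighted sum of values, can be sketched for a single head on plain arrays. This is an illustration only, not code from Seq2SeqSharp: the `AttentionSketch` class and its `Softmax` and `Attend` methods are hypothetical, and batching, multiple heads, the projections, and the skip connections are left out.

```csharp
using System;

public static class AttentionSketch
{
    // Softmax over one row, subtracting the maximum for numerical stability.
    public static double[] Softmax(double[] scores)
    {
        double max = double.NegativeInfinity;
        foreach (double s in scores) max = Math.Max(max, s);

        var result = new double[scores.Length];
        double sum = 0.0;
        for (int i = 0; i < scores.Length; i++)
        {
            result[i] = Math.Exp(scores[i] - max);
            sum += result[i];
        }
        for (int i = 0; i < result.Length; i++) result[i] /= sum;
        return result;
    }

    // q: [queries][d], k: [keys][d], v: [keys][dv].
    // Returns softmax(q * k^T / sqrt(d)) * v with shape [queries][dv].
    public static double[][] Attend(double[][] q, double[][] k, double[][] v)
    {
        int d = q[0].Length;
        double scale = 1.0 / Math.Sqrt(d);

        var output = new double[q.Length][];
        for (int i = 0; i < q.Length; i++)
        {
            // Scaled dot product of query i with every key.
            var scores = new double[k.Length];
            for (int j = 0; j < k.Length; j++)
            {
                double dot = 0.0;
                for (int c = 0; c < d; c++) dot += q[i][c] * k[j][c];
                scores[j] = dot * scale;
            }

            double[] weights = Softmax(scores);

            // Weighted sum of the value rows.
            output[i] = new double[v[0].Length];
            for (int j = 0; j < v.Length; j++)
            {
                for (int c = 0; c < output[i].Length; c++)
                {
                    output[i][c] += weights[j] * v[j][c];
                }
            }
        }
        return output;
    }

    public static void Main()
    {
        var q = new[] { new[] { 1.0, 0.0 }, new[] { 0.0, 1.0 } };
        var k = new[] { new[] { 1.0, 0.0 }, new[] { 0.0, 1.0 } };
        var v = new[] { new[] { 10.0, 0.0 }, new[] { 0.0, 10.0 } };

        double[][] result = Attend(q, k, v);
        foreach (double[] row in result)
        {
            Console.WriteLine(string.Join(", ", row));
        }
    }
}
```

In this toy input each query matches one key more strongly than the other, so each output row leans towards the corresponding value row. The `scale` factor plays the same role as `1.0f / (float)Math.Sqrt(m_d)` in the listing above.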


References

[1] "Seq2SeqSharp LICENSE". GitHub.
[2] "Seq2SeqSharp Project". https://github.com/zhongkaifu/Seq2SeqSharp. Retrieved 2019-10-10.
[3] "Seq2SeqSharp: a tensor based fast and flexible encoder-decoder deep neural network framework written by .NET (C#)". GitHub.

This article "Seq2SeqSharp" is from Wikipedia. The list of its authors can be seen in its history and/or the page Edithistory:Seq2SeqSharp. Articles copied from the Draft namespace on Wikipedia can be found in Wikipedia's Draft namespace, not the main one.

