So, dear Ehsminer fans, stay tuned for more good news in the upcoming days. Thank you.
ACSMA Project Manager
We have been working very hard on the development of ACSMA, our ultra-optimized Scrypt mining architecture. This work is carried out at our ASIC development facility in Denver, Colorado, USA.
In order to test this very dense logic design, several steps have been taken.
Step 1: Test of a reduced number of blocks being mined in a simplified version of ACSMA.
This first step required two phases:
1) A simulation of a simplified version of ACSMA is tested thoroughly on a PC server in order to verify the functioning of the complete mining process, configuration and communications.
2) Logic synthesis, the process of converting the high-level language used to create ACSMA into hardware primitives such as gates, memory, FIFOs and registers. This is used to analyze resource requirements and for FPGA verification on an ACHRONIX testbed, for demo purposes.
Step 2: Test of the full version of ACSMA with a reduced number of Scrypt-defined ROMMIX units (because of memory requirements).
This second step required two phases:
3) A simulation of the mining process is tested on a PC server. Once again, this is to verify the full functioning of the mining process of a large-scale ACSMA unit.
4) Logic synthesis and FPGA verification are also used to analyze logic resources, using a dual XILINX-ACHRONIX testbed.
This also serves to showcase our product.
We have been working on all these points concurrently. We can now show results for point 1 and point 2.
Our demo of point 3 is almost ready and will be shown soon.
Once point 4 is finished and shown, ASIC preparation can start with our ASIC partner.
So we will keep you posted, as things are now taking shape.
We are pleased to announce that our USB 2.0 high-speed communications interface is now fully working. It was designed to be compatible with the upcoming new USB 3.0 FTDI chip.
This interface will deliver a maximum of 480 Mb/s to the mining rig in its USB 2.0 version, and around 8 times more in the USB 3.0 version.
This is a huge figure which, even with OS overhead, is more than enough for payload delivery.
Now we are in the process of putting together the final proof of concept. A Raspberry Pi Linux system will be used as a Host miner (for the demo only). CPUminer 0.8 has been adapted to feed ACSMA and we are now in the process of testing.
We have also gained new knowledge of the Scrypt algorithm and will introduce a few changes in ACSMA that could potentially make it a game changer. Because of the tremendous competition, we cannot disclose them at this point, but you should know that parallel design work on the architecture is always going on.
The aim is to offer the best of our effort to our customers. As for us, this is going to be our main engineering activity.
So Merry Christmas and Happy New Year.
This represents a lot of communication bandwidth. A new USB 3.0 host part is in the making at our supplier FTDI in the UK, but it will probably only be available by the middle of next year.
So we set out to design an interface that will be compatible with that new part, which is also compatible with the signal levels we need in our ASIC chips.
For now, we have to use level shifters, and this is proving awkward in configurations of 200 chips or more.
Finally, we are in the process of testing these new level adapters and we hope to have this section working soon.
We are aware that a lot of people are waiting for our proof-of-work design. Although the functional verification in simulation was done a long time ago, we have incurred several delays in the FPGA port.
Last week we discovered that we were experiencing a lot of noise in the communication interface. This came from the fact that this interface is high speed and is supposed to support in excess of 200 ASIC chips. But the real problem was in the Achronix verification platform. Their development tools had a bug, and we were left with only one option for synchronizing the external high-speed clock of the communication interface with the communication module inside the FPGA. This created so much noise that we started to get corrupted data. It was only yesterday that it became clear that our communication interface has been properly designed, and we can now continue our tests of the ACSMA architecture.
The ACSMA architecture has become very complex, and it has been quite a challenge to devise an efficient testing mechanism. Therefore, to speed up the design validation, we decided to read the results of the different modules as the computation progresses through the unit, and then compare them to the pure RTL simulation.
For this we modified the communication interface so that it can transfer big data streams and upload them to a host computer. This implies a communication test interface that uses most of the USB bandwidth.
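To give an idea of what such a comparison could look like on the host side, here is a minimal Python sketch. It is not our actual test tool: the trace format (a module name mapped to a list of captured values), the function name and the module names are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical trace format: module name -> values captured in order.
Trace = Dict[str, List[int]]


def first_mismatch(hardware: Trace, simulation: Trace) -> Optional[Tuple[str, int]]:
    """Return (module, index) of the first differing value, or None if all match."""
    for module in sorted(simulation):
        expected = simulation[module]
        observed = hardware.get(module, [])
        for index, value in enumerate(expected):
            # A sample missing from the hardware capture counts as a mismatch too.
            if index >= len(observed) or observed[index] != value:
                return module, index
    return None


if __name__ == "__main__":
    # Made-up values, only to show the call.
    sim = {"module_a": [0x1A, 0x2B], "module_b": [0x3C, 0x4D]}
    fpga = {"module_a": [0x1A, 0x2B], "module_b": [0x3C, 0x00]}
    print(first_mismatch(fpga, sim))
```

Reporting only the first mismatch keeps the output short and points at the earliest module where the hardware diverges from the simulation.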
At this stage we suffered a setback last week, when the signal quality of the main communication clock source proved incompatible with the test platform. We tried different approaches as suggested by Achronix support. As this product is new, there were several inconsistencies that made us take other approaches, since the main goal is not an FPGA port of ACSMA but a validation of the design to be ported to an ASIC solution.
We want to clarify an important aspect. As we explained in our last post, “Development update 10-17-14”, the hash rate number can only be a power of two.
For this reason, for all customers who paid in full before this day for their orders of the Wolf V1 512 – 628 Mh/s and the 1 – 1.22 Gh/s versions, the orders will be converted free of charge to the 1024 Mh/s and the 2048 Mh/s versions respectively.
So please bear with us, as we are laying the foundation for highly optimized technological solutions for cryptocurrencies.
The development of an ASIC is a dynamic process. It requires compromises between price, power, heat and silicon area. These decisions are taken by evaluating and testing the design in the FPGA prototyping phase.
We had to make a minor change in one of the modules of ACSMA, which required minimizing the number of RAM blocks we were using, since Nfactor algorithms in ACSMA require memory sizes that must be a power of 2. However, in our ACSMA architecture, Litecoin can use some other sizes.
While the memory reduction was only 5%, it led us to modify several things and then simulate the behavior with a true cycle simulation tool.
We are now in the process of routing the design and starting to test it in the FPGA evaluation platform with the new changes. A new logic analyzer module has been added to follow the hashing computation in real time.
This design, in a simplified form, now approaches 4 million equivalent gates and 200K lines of hardware description language code.
It is important to produce a highly optimized architecture, as this will lessen heat production and power consumption, which are considered the key elements of the design.
Dear Miners, we wanted to give you a quick update on the progress of the work. Rest assured, everything is running well. Here is where the work stands:
The pieces work by themselves, and now the whole system is going to be integrated as a single unit for the proof of concept.
The Scrypt algorithm was designed so that its implementation on computer systems performs very poorly.
Although it can be better optimized on FPGA systems, it is not economically feasible to obtain high hashing numbers via FPGA. The only viable solution is an ASIC.
We studied the architecture of the algorithm in order to design a very efficient system, rather than relying on the typical brute force of a sheer number of mining cores found in other mining solutions. To explain the basics, it is better to give some figures:
It is well established that one simple hash on a dedicated digital circuit takes around 150,000 clock cycles to execute. We can infer that the shorter the time those cycles take, the higher the hashing rate will be. In other words, the hashing rate of the circuit can be characterized by FC/150k, where FC is the clock frequency of that circuit.
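As a rough illustration of the FC/150k relation, the short Python sketch below computes the hash rate of a single core at a few clock frequencies. The function and constant names are ours and purely illustrative, and the example frequencies are arbitrary. The 150,000-cycle figure is the one quoted above.

```python
# Clock cycles for one Scrypt hash on a dedicated circuit (figure quoted above).
CYCLES_PER_HASH = 150_000


def single_core_hash_rate(clock_hz: float) -> float:
    """Hashes per second for one core clocked at clock_hz, i.e. FC / 150k."""
    return clock_hz / CYCLES_PER_HASH


if __name__ == "__main__":
    # Example frequencies only; they are not design targets.
    for clock_mhz in (100, 600, 1000):
        rate = single_core_hash_rate(clock_mhz * 1e6)
        print(f"{clock_mhz} MHz -> {rate:,.0f} H/s per core")
```

Even at 1 GHz a single core stays below 7 kH/s, which is why conventional designs rely on thousands of cores.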
There are only two possible options to increase the hash rate:
Increase the frequency of the circuit, or use a higher number of cores in parallel.
Increasing the frequency is part of our plan. We are considering the 45nm and 22nm technologies. Besides the price factor, we must also consider the thermal issue.
A mining solution with 2000 to 2500 cores is a very high count, although it has been considered by other companies.
The number of cores is therefore another limiting factor.
Instead, we chose to look at the architectural side of multiple-core mining solutions. We realized that, across all cores, only a small fraction of the electronics was being used at any particular moment in time. It is much easier to copy and paste silicon across thousands of cores and produce an ASIC very quickly.
But this approach is nowhere near optimized, as these cores are not interconnected, nor do they help each other in any way. So billions of expensive transistors are not fully utilized!
This analysis provided the basic guidelines for the design of ACSMA. To start giving some numbers on the power of ACSMA, we can go back to the original formula for the hashing rate, FC/150k.
If we had the possibility of integrating a huge number of hashing cores, let's say 150k cores, the hashing rate of one circuit would be equal to FC.
For a 1 GHz circuit frequency we could obtain 1 GHash/s, but this is obviously not possible with the current state of technology. With the standard solution, we would need in excess of 50 ASIC chips, each with 2500 mining cores.
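The arithmetic behind these figures can be checked with a few lines of Python. This is only a sketch under the numbers quoted above (150,000 cycles per hash, 2500 cores per chip). The helper names are illustrative, and integer arithmetic is used to avoid rounding surprises.

```python
CYCLES_PER_HASH = 150_000  # figure quoted in the text


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def cores_needed(target_hps: int, clock_hz: int) -> int:
    """Cores required to reach target_hps when each core delivers FC / 150k."""
    return ceil_div(target_hps * CYCLES_PER_HASH, clock_hz)


def chips_needed(total_cores: int, cores_per_chip: int) -> int:
    """Chips required to host total_cores."""
    return ceil_div(total_cores, cores_per_chip)


if __name__ == "__main__":
    cores = cores_needed(10**9, 10**9)   # 1 GHash/s at 1 GHz
    print(cores, chips_needed(cores, 2500))
```

With these inputs the sketch gives 150,000 cores spread over 60 chips, consistent with the "in excess of 50 ASIC chips" stated above.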
The architecture of ACSMA addresses the optimization of electronics usage: if the circuit contains 1 billion transistors, the aim is for all of them to be used at the same time. This optimization is directed at decreasing the silicon size of the ACSMA die, allowing for more equivalent units and for an increase in the size of the embedded high-speed RAMs.
Early in the design of the Litecoin miner we discovered that Nfactor mining could easily be added with a few changes. As a matter of fact, ACSMA could also process Bitcoin. This is one of the reasons for the word “configurable” in the name ACSMA.
ACSMA has been designed to increase the hash rate through its architectural organization. The aim was to produce one hash per clock cycle. This is not as far-fetched as it seems. Let’s take the example of the 150k-core architecture. We could run that circuit and produce a total of 150k hashes every 150k cycles (as the cores run in parallel). This is equivalent to 1 hash per cycle, but there is a drawback.
There is an associated latency of 150k cycles. We looked into this and tried to minimize that latency. ACSMA cannot escape the curse of the Scrypt algorithm's memory requirements, but by optimizing the silicon die area to include more memory, this is now possible. ACSMA chips will contain between 2 gigabits and 4 gigabits of memory.
ACSMA chips have specialized modules that deal with the memory blocks. The computational power of ACSMA depends on the total internal memory, given by:
Nu = number of memory units (a minimum of 50 in each chip)
M = number of contiguous 1-megabit memory blocks per unit = 64
Nram = Nu*M = 3.2 gigabits of total memory
The final value of Nram will be defined in the next design phase, but a minimum of 3.2 gigabits is part of the design requirements with the ASIC contractor.
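The memory budget above is a simple product, sketched here in Python with the minimum values quoted (Nu = 50, M = 64 blocks of 1 megabit). The function and parameter names are illustrative only.

```python
def total_memory_megabits(num_units: int, blocks_per_unit: int, block_megabits: int = 1) -> int:
    """Nram = Nu * M, expressed in megabits."""
    return num_units * blocks_per_unit * block_megabits


if __name__ == "__main__":
    nram = total_memory_megabits(num_units=50, blocks_per_unit=64)
    print(f"Nram = {nram} megabits = {nram / 1000} gigabits")
```

With the minimum of 50 units this yields 3200 megabits, i.e. the 3.2 gigabits given as the design requirement.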
Now, our Scrypt implementation can write each 64-megabit block in 64K cycles. This defines the hashing power of ACSMA, and it means that one unit inside ACSMA is able to produce an equivalent hash in a little over 1000 cycles. The total hashing capability is obtained by multiplying by Nu, the number of memory units.
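One way to see where "a little over 1000 cycles" comes from is sketched below. It rests on assumptions that we state explicitly: the block written by one unit is taken to be the 64 contiguous 1-megabit blocks defined above, a megabit is taken as 2^20 bits, 64K cycles as 65,536, and one Litecoin hash (Scrypt with N = 1024, r = 1) needs a 128 KiB scratchpad, i.e. 1 megabit. The names are illustrative.

```python
# Litecoin Scrypt scratchpad: 128 * r * N bytes with N = 1024, r = 1 -> 128 KiB.
SCRATCHPAD_BITS = 128 * 1024 * 8


def hashes_per_block(block_bits: int) -> int:
    """How many Litecoin scratchpads fit in one memory block."""
    return block_bits // SCRATCHPAD_BITS


def cycles_per_hash(block_bits: int, cycles_per_block: int) -> float:
    """Equivalent cycles per hash when a whole block is written in cycles_per_block."""
    return cycles_per_block / hashes_per_block(block_bits)


if __name__ == "__main__":
    block_bits = 64 * 2**20          # assumed 64-megabit block per unit
    print(hashes_per_block(block_bits), cycles_per_hash(block_bits, 64 * 1024))
```

Under these assumptions a unit holds 64 scratchpads and averages 1024 cycles per hash.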
FC = 600 MHz
The formula for determining the hashing power for Litecoin (Nfac = 10), with basic values:
Hlcpwr = Nu*(FC/1000) (or 30 Mh/s with basic requirements).
With this setting, a 1 Gh/s mining rig will require 30 chips or so. Our current API addressing scheme allows using up to 256 chips.
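The basic figures can be reproduced with the following Python sketch, which uses the values stated above (Nu = 50, FC = 600 MHz, about 1000 cycles per hash). The function names are illustrative and not part of any API of ours.

```python
import math


def litecoin_hash_rate(num_units: int, clock_hz: int, cycles_per_hash: int = 1000) -> float:
    """Hlcpwr = Nu * (FC / 1000), in hashes per second."""
    return num_units * clock_hz / cycles_per_hash


def chips_for_target(target_hps: float, chip_hps: float) -> int:
    """Number of chips needed to reach target_hps, rounded up."""
    return math.ceil(target_hps / chip_hps)


if __name__ == "__main__":
    per_chip = litecoin_hash_rate(num_units=50, clock_hz=600_000_000)
    print(f"{per_chip / 1e6:.0f} Mh/s per chip")
    print(chips_for_target(1e9, per_chip), "chips for 1 Gh/s")
```

This gives 30 Mh/s per chip and 34 chips for a 1 Gh/s rig, well within the 256-chip addressing limit.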
Nfactor mining is based on huge memory requirements that depend on the value of Nfac.
Because memory inside an ASIC is a fixed resource, all we can do is use more memory for each hash.
This reduces the ACSMA mining hash rate, Hnfac, depending on the value of Nfac, and it can be stated in terms of Hlcpwr as follows:
Hnfac = Hlcpwr/[2^(Nfac-10)]
This basically means that, using the same amount of memory inside the ACSMA chips, the hashing power is halved every time Nfac is increased by one.
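The halving rule can be tabulated with a short Python sketch. It assumes the basic Litecoin figure of 30 Mh/s per chip given above. The function name is illustrative, and the range of Nfac values shown is arbitrary.

```python
def nfactor_hash_rate(litecoin_hps: float, nfac: int) -> float:
    """Hnfac = Hlcpwr / 2^(Nfac - 10); Nfac = 10 is the Litecoin case."""
    if nfac < 10:
        raise ValueError("this sketch only covers Nfac >= 10")
    return litecoin_hps / 2 ** (nfac - 10)


if __name__ == "__main__":
    hlcpwr = 30e6  # basic per-chip Litecoin rate from the text
    for nfac in range(10, 15):
        print(f"Nfac = {nfac}: {nfactor_hash_rate(hlcpwr, nfac) / 1e6:.3f} Mh/s")
```

Each step up in Nfac doubles the memory used per hash, so the same chip delivers half the previous rate.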
Now, these numbers are just used as requirements for our basic design. The frequency can be increased and, given the reduced silicon area that ACSMA was designed for, we will increase the number of memory blocks and therefore the total hashing power.
We love crypto
We are still perfectly on track to release the prototype in less than 25 days from now. It is quite a monumental task to set up a powerful miner which is fast and reliable. We have already passed the biggest hurdles and are well on our way to completing the final tasks within our set time frame.
Below are the details of our development:
We studied several approaches to develop a powerful and configurable architecture capable of mining different new SCRYPT oriented virtual currencies.
Development started by structuring the architecture in a high-level language, with HLS conversion to obtain RTL code. For the purposes of FPGA prototyping we obtained 200K lines of RTL code, which were tested in functional analysis. This phase was carried out with a real cycle simulator.
After functional simulation, we now have an idea of the resources required to prototype on the FPGA platform. The Scrypt algorithm requires a lot of memory. In a few words, the trick is: “the faster you can run the algorithm, the faster you produce data, and the more memory you need”.
To test an architecture like ACSMA we needed an FPGA system with a lot of embedded memory blocks. In our case 80 Mbits was the minimum required, and only the Achronix FPGA was fast enough and had that much memory.
The next processing phases consist of physically porting all the RTL code into FPGA logic and memories.
After logic synthesis we start to see our developing architecture in terms of logic primitives.
The Mapping phase is where everything is optimized and connected. At this point we have converted all the language code into FPGA resources, and we need to verify that the conversion process is right and that the behavior is equivalent to the functional code we started with.
After synthesis and mapping are done, the last and most time-consuming phase, routing, starts:
This phase is iterative and very computationally intensive, and it leaves very little room for human intervention.
Only directives are given at the start of the routing process to guide the routing algorithm.
After this phase, everything becomes easy: we’ll just do the FPGA/ASIC conversion. The FPGA-based design can be converted into an ASIC, which can then be used as a drop-in replacement; this process takes about four working months. Production quantities are available four weeks after conversion approval.
First of all, thank you for your patience. We appreciate it, since it has been a while since we last updated you with our news.
We are happy to announce that we are in the final stage of development.
We love crypto