Due to many setbacks, progress at Espoo has been fairly slow lately, despite hardware being plentiful.
As soon as we solve these setbacks, we should be able to *double* the number of running nodes fairly easily.

In the meantime, we have decided to build a small batch of local-drive nodes to get something new online ASAP. The batch is a mix of 8G, 16G and 2G nodes.
Those nodes lose the main benefits of our setup, but they get online ASAP with guaranteed performance.

Some of the setbacks have been on the level of ridiculous. For the first unit of our new storage mobo + CPU combo, we had to go through 4 CPUs, 3 mobos and a few sets of RAM before finally identifying the issue: despite the CPU being on the supported CPU list, it does not actually work, and as soon as there is significant CPU load the system crashes. This alone wasted more than a week of our time!

Bahamut: our new model of big storage unit has already taken a ridiculous amount of time to get online, only for it to turn out that I personally made a configuration mistake, so we have to migrate all nodes off it to reconfigure the whole array and system. Duh! Further, we noticed that the serial numbers reported by SMART and the serial numbers printed on the OCZ SSD drives we are using do not match. We need to map those out in any case so that we are able to replace failed drives.
Oh well, we'll upgrade it a bit in the process.
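
For the serial number mismatch, one way to keep track is a small lookup table from the SMART-reported serial to the physical bay and the serial printed on the drive label. Below is a minimal sketch of that idea, not our actual tooling: it assumes the text comes from `smartctl -i`, which prints a `Serial Number:` line for ATA drives, and the function names, bay names and label serials are all hypothetical.

```python
import re


def parse_smart_serial(smartctl_info):
    """Return the serial from `smartctl -i` style text, or None if absent."""
    match = re.search(r"^Serial Number:[ \t]*(\S+)", smartctl_info, re.MULTILINE)
    return match.group(1) if match else None


def build_serial_map(drives):
    """Map SMART serial -> bay and label serial.

    `drives` is an iterable of (bay, label_serial, smartctl_info) tuples,
    where label_serial was read off the drive by hand (hypothetical input).
    """
    mapping = {}
    for bay, label_serial, info in drives:
        smart_serial = parse_smart_serial(info)
        if smart_serial is None:
            raise ValueError(f"no serial number found for drive in {bay}")
        mapping[smart_serial] = {
            "bay": bay,
            "label_serial": label_serial,
            "matches": smart_serial == label_serial,
        }
    return mapping
```

When a drive fails, looking up its SMART serial in the resulting table gives the bay to pull and the label to expect on the drive, even when the two serials differ.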

The good news is we have plenty of hardware!
There are about 50 2G nodes ready to be assembled and deployed, a bunch already tested and waiting for final assembly + racking, 10 or so 4G nodes waiting for assembly + testing, 10+ E350 nodes for 8G/16G, and on the racks we have more than 10 unused nodes waiting for storage :)

RAM: we seriously have A LOT of it. A shipment of 65 2G DDR2 modules just arrived, and we have some 30+ 1G DDR2 modules on the shelf, about 30x 4G DDR3 modules, about 10x 2G DDR3 SODIMMs, 10x 4G DDR3 SODIMMs (for the i3s), some 8G DDR3 SODIMMs, 20x or so 2G DDR2 SODIMMs and so forth.

We have riser cables for probably 100 nodes or so, NICs for something like 40 nodes, and picoPSUs for 35 nodes, plus some awaiting delivery.

Drives: we have some 20x 146GB 15k SAS, 10x 2TB SATA and 8x 3TB SATA waiting on the shelf. Ramuh is waiting to be finalized and has 10x 3TB + SSD cache. Bahamut was still way underutilized and can host 10 more nodes once the fixes are done. In total, we have storage waiting to be taken into production for 50+ nodes or so.

Our custom blade design has progressed a bit as well, and we are waiting for the first prototypes to be 3D printed. We also designed some miscellaneous pieces to be 3D printed for our own use. On that front, we have also ordered an oversized 3D printer so we can print multiple blade trays at once, etc.

Cooling upgrades are under way again, to increase the airflow so that the air in our whole colocation room is circulated every 3 minutes. Let's see how that goes! :)

We have also been planning for the future: after this huge batch of nodes, we will probably settle for roughly 20-30 new nodes a month, increasing monthly at a slow pace. That's not too many, but a steady supply is a good thing.



Sunday, November 24, 2013