Solo3 (Storage): All components have been swapped, but it is still giving CANNOT IDENTIFY DRIVE errors -> the PSU is probably broken. It will be swapped today, and we will see if we can get this into production.

Solo4 (Storage): Getting built in the coming days. This is a new config which might require extra days to test. It will have 10x3TB drives and a Samsung 840 PRO 256GB SSD cache drive.

Solo1 (Storage): SSD caching enabled in write-around mode -> seems to give roughly one magnetic drive's worth of performance boost when using eio. Not happy with the performance increase. Planning on tiered storage for the future.

Network infra:
2xExtreme Networks Summit200-48 switches purchased for 100Mbps.
2xCisco Catalyst 48-port + 2xGBIC switches acquired for 100Mbps, and the corresponding GBIC modules purchased.
Still got 1xExtreme Networks Summit200-48 waiting to be deployed.
1xBrocade switch arrived at mid warehouse for the SAN infra (48x1Gbps + 4x10GbE)

Contemplating acquiring a Dell PowerConnect 6224F -> 24xSFP+ 10Gb ports, and same-series 48xGbit + 2xSFP+ switches, for December delivery from the States.

Purchased a lot of network cables, over 600 euros' worth; we have now spent some 900 euros on cables alone! :O On top of that come all the accessories for cable management.

3xCore i3 are waiting to get into production. 4xAMD E350 are waiting to enter production. 4xAtom D410PT are waiting to enter production. A misc cage of 2 & 4G Atoms (D510, D525, N2800), 8 units, is racked and waiting to enter production.

A cage of 8xAtom D510/D525 with 2G each is still waiting for racking & wiring; the next cage of 8 Atoms is waiting for misc accessories before it can be built.

We are starting to run out of ports on the internet access side -> adding one of the Summits during the weekend.

Shipping 3x3TB drives + 2xCore i3s to warranty today. Waiting for 3x3TB drives to come back from warranty. 2 more suspect drives are waiting for rechecking before being sent to warranty. If they have indeed failed, that makes 8 drives sent to warranty already (yikes!)

Waiting from the States:
1xBrocade switch, a few SSD drives, about 20 nodes, lots of RAM, a couple of high-end Seasonic PSUs, high-end Adaptec SAS/SATA controllers (16 ports each!) etc.
One vendor shipped a batch of 10xAtom D2500, 10x92mm cooling fans, 3x120mm fans and 6xPSUs today.

Currently in stock: about 20 more pico PSUs, 40 NICs (each node needs 2), 10+ 3TB drives, a few 2TB drives (spares), a bunch of 1TB drives (will probably never enter production), 18x120mm fans, some 30x80mm Papst fans etc. etc. Today we are receiving a shipment of the correct plugins and other accessories for assembly.
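As a rough sketch of what that stock translates to in buildable nodes: the NIC count and the 2-per-node figure are from the list above, while the one-pico-PSU-per-node figure and the function name are assumptions made purely for illustration.

```python
# Stock figures from the list above; "about 20" pico PSUs is taken as 20.
NICS_IN_STOCK = 40
NICS_PER_NODE = 2
PICO_PSUS_IN_STOCK = 20
PSUS_PER_NODE = 1  # assumption: not stated in the post


def buildable_nodes(nics: int, psus: int) -> int:
    """Return how many nodes can be assembled, limited by the scarcer part."""
    by_nics = nics // NICS_PER_NODE
    by_psus = psus // PSUS_PER_NODE
    return min(by_nics, by_psus)


if __name__ == "__main__":
    print(buildable_nodes(NICS_IN_STOCK, PICO_PSUS_IN_STOCK))
```

Under those assumptions both parts run out at the same point, so neither one alone is the bottleneck.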

Contemplating a purchase of a lot of 16x AMD E350s. There are cheaper models available as well, but they are out of stock right now, so we are considering these more expensive ones.

At the end of the month we should receive the first 20-bay hot-swap chassis for review; anxious to receive it. Contemplating tiered storage for it. To do that, however, I think we need to intentionally wear some SSDs first so that their failure times will not be close to each other.
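One way to picture the intentional wear idea is to give each SSD a different head start on its write endurance. This is a minimal sketch only: the endurance figure, the daily write volume, the 30% ceiling and both function names are hypothetical, and real SSD failures do not follow rated endurance this neatly.

```python
from typing import List


def staggered_prewear(n_drives: int, max_fraction: float) -> List[float]:
    """Evenly spaced pre-wear fractions, from 0 up to max_fraction."""
    if n_drives < 1:
        raise ValueError("need at least one drive")
    if n_drives == 1:
        return [0.0]
    step = max_fraction / (n_drives - 1)
    return [i * step for i in range(n_drives)]


def days_until_worn(endurance_tb: float, prewear: float, daily_tb: float) -> float:
    """Days of remaining write endurance after a given pre-wear fraction."""
    return endurance_tb * (1.0 - prewear) / daily_tb


if __name__ == "__main__":
    # Hypothetical numbers: 4 SSDs, 100 TB endurance, 0.1 TB written per day.
    for fraction in staggered_prewear(4, 0.3):
        days = days_until_worn(100.0, fraction, 0.1)
        print(f"pre-wear {fraction:.0%}: about {days:.0f} days left")
```

The point is only that the drives reach their wear limit at spread-out times instead of all at once.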

We are also wondering how long the cooling will last us, but fortunately an upgrade adding a secondary cooling unit is scheduled for the winter. Our current cooling unit has just 9.4kW of cooling capacity, but since it's the only one we do not dare to load it much beyond 6kW; after adding the second one our capacity should be at around 15kW or more. It does help, however, that we run the DC "hot", at just shy of 30C ambient (fluctuates 28-29). It's nowhere near as hot as Google runs theirs, however, and our metric is disk temperature -> keeping the disks below 40C is what we target.
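The cooling numbers above work out roughly as follows. This sketch assumes the same load-to-capacity ratio would be kept after the upgrade, which the post does not say; the function names and the sample disk temperatures are made up for illustration.

```python
from typing import Dict, List

CURRENT_CAPACITY_KW = 9.4   # from the post
CURRENT_MAX_LOAD_KW = 6.0   # "not much beyond 6kW"
PLANNED_CAPACITY_KW = 15.0  # "around 15kW or more"
DISK_TARGET_C = 40.0        # disks should stay below this


def utilization(load_kw: float, capacity_kw: float) -> float:
    """Fraction of cooling capacity in use."""
    return load_kw / capacity_kw


def derated_load(capacity_kw: float, ratio: float) -> float:
    """Load allowed if the same utilization ratio is kept (an assumption)."""
    return capacity_kw * ratio


def disks_over_target(temps_c: Dict[str, float],
                      target_c: float = DISK_TARGET_C) -> List[str]:
    """Names of disks at or above the target temperature, sorted."""
    return sorted(name for name, t in temps_c.items() if t >= target_c)


if __name__ == "__main__":
    ratio = utilization(CURRENT_MAX_LOAD_KW, CURRENT_CAPACITY_KW)
    print(f"current ceiling: {ratio:.0%} of capacity")
    print(f"same ratio at 15kW: {derated_load(PLANNED_CAPACITY_KW, ratio):.1f} kW")
    print(disks_over_target({"disk-a": 38.0, "disk-b": 41.5}))
```

With two units the safe ratio may well be higher than with a single one, so the second figure is a conservative guess rather than a plan.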

Lots of software work remains to be done: most of the monitoring, new distro templates, better identification and management of physical units, and automatic per-node monitoring + reboot (along with the hardware!).
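The per-node monitoring + reboot piece could start from something as simple as counting failed health checks in a row. This is a hypothetical sketch, not the actual tooling: the names, the threshold of 3 and the idea of a boolean health check are all assumptions, and how the check and the reboot are really performed is left out.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Tracks the current streak of failed health checks for one node."""
    name: str
    consecutive_failures: int = 0


def record_check(state: NodeState, healthy: bool, threshold: int = 3) -> bool:
    """Update the failure streak; return True when a reboot should be triggered."""
    if healthy:
        state.consecutive_failures = 0
        return False
    state.consecutive_failures += 1
    if state.consecutive_failures >= threshold:
        # Start counting afresh once a reboot has been requested.
        state.consecutive_failures = 0
        return True
    return False


if __name__ == "__main__":
    node = NodeState("node-01")  # hypothetical node name
    for ok in [False, False, True, False, False, False]:
        if record_check(node, ok):
            print(f"{node.name}: reboot requested")
```

Requiring several failures in a row keeps a single missed check from power-cycling a node that is merely busy.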

Storage is stable and performing well. We are seeing higher peak bandwidth usage every day.

 

The immediate backlog is about 30 nodes, plus 10 migration nodes.



Thursday, August 15, 2013
