Using a Marvell LAN card with ESXi 4
(Note: This post was initially written when ESXi 4.0 was available. As of late 2010, ESXi 4.1 has been released, and it does actually include a sky2 driver that may or may not work with various Marvell LAN chipsets. The post is still relevant (especially the comments) if your particular Marvell chipset does not work with the sky2 driver in ESXi 4.1. Also, the post is relevant if you’re interested in porting other network drivers to ESXi.)
Well, after somehow getting my Marvell LAN card working with ESXi 3.5u4 (and u3) I thought I’d have a look at ESXi 4. Again I somehow got it to go. I’m not too sure how well it works, but it works well enough for me at home. If you can’t be bothered reading about me going on and on and on and on about how to compile it, then just scroll to the bottom of the post. The download for the source includes a precompiled module (NB: As per the post about ESXi 3.5, this is all about getting an 88E8053 chipset Marvell LAN working).
ESX/ESXi 4 is quite different from 3.5. The build chain is similar to 64 bit Redhat/CentOS 5.2, so I ended up installing an x86_64 CentOS 5.3 inside a VMware Fusion machine to do my dev work. I just made sure I installed all the dev stuff. Then I downloaded the VMware-esx-public-source-4.0-162945.tar.gz from VMware’s open source page. It’s a much bigger file (590MB) than the file for 3.5. When you extract the file you end up with a lot of rpm files plus a vmkdrivers-gpl.tgz file. I did the following to extract it all on my test machine;
cd ~
mkdir vmware-oss
cd vmware-oss
tar xvzf ~/VMware-esx-public-source-4.0-162945.tar.gz
mkdir drivers
cd drivers
tar xvzf ../vmkdrivers-gpl.tgz
One of the rpm files included is a kernel source rpm. I’m not exactly sure how it is relevant to ESX, but I installed it anyway for reference. I found I needed the qt-devel and gtk2-devel packages first;
cd ~
cd vmware-oss
yum install qt-devel
yum install gtk2-devel
rpm -iv kernel-sourcecode-400.2.6.18-128.1.1.0.4.159770.x86_64.rpm
I’m pretty sure you don’t need the kernel source to build the drivers, but I kept it handy for reference anyway.
You can try doing a test build of the drivers now. This will build all the drivers built in to ESX/ESXi.
cd ~/vmware-oss/drivers
./build-vmkdrivers.sh
You’ll probably get a few warnings, but it should complete. If you do a find down the ‘bora’ directory you should see a bunch of .o files corresponding to the kernel modules (look under the ‘bora/build/scons/build’ directory).
OK, my approach was to look at the build-vmkdrivers.sh script and basically look at what was done to compile one network driver (I used the forcedeth driver as a reference) and just make a reduced script for my sky2 driver. As for the source to base the sky2 driver on, instead of using a driver from 2.4.37 like I did with the 3.5 version of the network driver, this time I ended up using the sky2 source from 2.6.26 (or the debian lenny incantation of it). I did originally use the sky2 driver from the kernel-sourcecode…2.6.18…rpm file, but on closer inspection of the tg3 driver that ESX uses, I noticed it actually comes from a 2.6.24.1 kernel (or thereabouts) … so I thought I may as well use a more modern reference source. I had a few minor hiccups trying to get it to compile, but in the end I just had the following shoved into the top of my sky2.c file;
/* Stuff for ESX compile */
#define upper_32_bits(n) ((u32)(((n) >> 16) >> 16))
#define csum_offset csum
#define bool int
#define PTR_ALIGN(p, a) ((typeof(p))ALIGN((unsigned long)(p), (a)))

/* Reverse the bit order of a 32-bit word by swapping ever smaller groups */
u32 bitreverse(u32 x)
{
    x = (x >> 16) | (x << 16);
    x = (x >> 8 & 0x00ff00ff) | (x << 8 & 0xff00ff00);
    x = (x >> 4 & 0x0f0f0f0f) | (x << 4 & 0xf0f0f0f0);
    x = (x >> 2 & 0x33333333) | (x << 2 & 0xcccccccc);
    x = (x >> 1 & 0x55555555) | (x << 1 & 0xaaaaaaaa);
    return x;
}
The defines are to remedy compilation errors, and the bitreverse is to satisfy an undefined symbol problem. Note that the undefined symbol errors end up in /var/log/messages on your ESXi 4 box now (unlike 3.5).
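If the log is long, something like the following can pull out just the names of the missing symbols. This is only a rough sketch: the function name unresolved_symbols is made up for illustration, and it assumes the vmkernel lines look like ‘Relocation of symbol <name> failed: Unresolved symbol’. It also assumes sed and sort are available wherever you run it, so you may need to copy the log to your dev machine first.

```shell
#!/bin/sh
# Sketch: list the unique unresolved symbol names found in a vmkernel log.
# Assumed line format (an assumption, adjust the pattern if yours differs):
#   ... WARNING: Elf: 1570: Relocation of symbol <name> failed: Unresolved symbol
unresolved_symbols() {
    # $1 is the log file to scan, e.g. a copy of /var/log/messages
    sed -n 's/.*Relocation of symbol <\([^>]*\)> failed: Unresolved symbol.*/\1/p' "$1" | sort -u
}

# Example use:
#   unresolved_symbols /var/log/messages
```

Each name it prints is a function (or variable) the module wants but the vmkernel does not export, so those are the ones to go hunting for in a real linux kernel tree.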
So in the end to compile my driver I did a;
./build-sky2.sh
You get a couple of warnings, but if you do a ‘find . -name sky2.o’ you should end up with two sky2.o files. There are DEBUG and DASHG variables defined at the top of the build script. If you uncomment these it’ll build a lot of debug stuff into the modules.
If you get some errors, it’s probably because some directories are missing in the build path, so make them first;
mkdir -p bora/build/scons/build/vmkdriver-sky2.o/release/vmkernel64/SUBDIRS/vmkdrivers/src26/drivers/net/sky2
mkdir -p bora/build/scons/build/vmkdriver-sky2.o/release/vmkernel64/SUBDIRS/vmkdrivers/src26/common/
Now, again I used a USB stick with ESXi 4 (build 171294 in my case). To install it, I loopback mounted the VMware ESXi 4 iso file, extracted the image.tgz file to a temp directory, bunzip2’d the big dd image file, then dd’d it to the whole USB stick (NB: There are some details about how to do this in linux on my other post re ESXi 3.5. See link at the top of the post).
A difference this time is that I didn’t have a simple.map file all pre-prepared to go into the oem.tgz file. I thought the easiest way to get it would be to just boot ESXi and let it fail when it tries to configure a network device, then somehow copy the simple.map file off. So I did this. ESXi 4 merrily boots and eventually you see the dreaded ‘lvmdriver failed’ message. It looks like ESXi is broken at that point, but just type the word ‘unsupported’ and you get a password prompt, and just hit ENTER to get a prompt (You might need to hit alt-f1 first before typing ‘unsupported’)
Because networking is not working, we’ll just copy the simple.map to the Hypervisor1 partition;
cp /etc/vmware/simple.map /vmfs/volumes/Hypervisor1
I just did a ‘sync’ and held in the power switch (perhaps type ‘reboot’ if you feel like being more careful). Now get the USB stick to appear as a USB device in your development VM (your CentOS 5.x environment), and mount partition 5 (the Hypervisor1 partition) off the USB drive, and you should see an oem.tgz file as well as the simple.map file. You need to make up a directory structure for the new oem.tgz file we’ll be creating;
cd ~
mkdir vmtest
cd vmtest
mkdir -p etc/vmware
mkdir -p usr/lib/vmware/vmkmod
Copy the simple.map off the USB drive into the etc/vmware directory in our tree structure, eg.
cp /mnt/simple.map ~/vmtest/etc/vmware
And edit the simple.map so that it includes the PCI ids for your Marvell card. Mine is 11ab:4362 so I added in the 11ab:4362 line below, but yours could well be different. If you’re not sure, you could boot ESXi off the USB stick again, do the ‘unsupported’ thing to get a prompt and type lspci -v
1166:0410 0000:0000 storage sata_svw.o
1166:0411 0000:0000 storage sata_svw.o
11ab:4362 0000:0000 network sky2.o
14e4:1600 0000:0000 network tg3.o
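If you would rather script this than edit by hand, here is a minimal sketch. The helper names ethernet_ids and add_map_entry are made up for illustration. The first assumes the usual linux ‘lspci -n’ output layout (slot, class, vendor:device), where class 0200 is an Ethernet controller; ESXi’s own lspci may print things differently. The second just appends the entry to the end of the file. I don’t know whether the order of lines in simple.map matters, so move the line by hand if you want it sitting with the other entries for the same vendor.

```shell
#!/bin/sh
# Sketch: print the vendor:device ids of Ethernet controllers.
# Reads 'lspci -n' style output on stdin, e.g.
#   02:00.0 0200: 11ab:4364 (rev 12)
ethernet_ids() {
    awk '$2 == "0200:" { print $3 }'
}

# Sketch: add "<id> 0000:0000 network <module>" to a simple.map file,
# unless that id is already listed.
# Usage: add_map_entry FILE VENDOR:DEVICE MODULE
add_map_entry() {
    map_file=$1
    pci_id=$2
    module=$3
    if grep -q "^$pci_id " "$map_file"; then
        echo "$pci_id is already listed in $map_file" >&2
        return 0
    fi
    printf '%s 0000:0000 network %s\n' "$pci_id" "$module" >> "$map_file"
}

# Example use (on a linux box with the card in it):
#   lspci -n | ethernet_ids
#   add_map_entry ~/vmtest/etc/vmware/simple.map 11ab:4362 sky2.o
```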
Now copy in the sky2.o file that we compiled earlier. The modules are in a different directory compared to 3.5 (NB: the compilation process produces two sky2.o files, so make sure you grab the one shown below)
cd ~/vmware-oss/drivers
cp ./bora/build/scons/build/vmkdriver-sky2.o/release/vmkernel64/sky2.o ~/vmtest/usr/lib/vmware/vmkmod
Now tar it up, and copy it to the USB stick that should be still mounted;
cd ~/vmtest
tar cvzf ~/oem.tgz *
cp ../oem.tgz /mnt
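Before rebooting, it can save a boot cycle to check that the tarball really has both files at the paths used in this post. A small sketch, assuming the archive was made from inside the vmtest directory as above (so paths have no leading ‘./’); check_oem is a made-up helper name.

```shell
#!/bin/sh
# Sketch: confirm an oem.tgz holds simple.map and the driver module
# at the paths used in this post.
# Usage: check_oem TARBALL MODULE   (e.g. check_oem ~/oem.tgz sky2.o)
check_oem() {
    tgz=$1
    module=$2
    # List the archive once; bail out if tar cannot read it
    listing=$(tar tzf "$tgz") || return 1
    for path in etc/vmware/simple.map "usr/lib/vmware/vmkmod/$module"; do
        # -x matches the whole line, -F treats the path as a fixed string
        if ! printf '%s\n' "$listing" | grep -qxF "$path"; then
            echo "missing from $tgz: $path" >&2
            return 1
        fi
    done
    echo "$tgz looks complete"
}

# Example use:
#   check_oem ~/oem.tgz sky2.o
```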
Unmount the USB stick
umount /mnt
Now try booting again. Hopefully you should see a ‘loading sky2’ flash up early in the boot … and it should eventually get to the usual ESX status screen showing the current mgmt IP address. Basically if you don’t see the ‘lvmdriver load failed’ then there’s a good chance it’s working.
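If you want something more definite than a message flashing past, you can look in the boot log from ‘unsupported’ mode. A tiny sketch, assuming the log reports success with a line like ‘Module sky2 loaded successfully.’ in /var/log/sysboot.log, which is what mine says; module_loaded is a made-up helper name.

```shell
#!/bin/sh
# Sketch: succeed if the boot log reports that a module loaded.
# Usage: module_loaded LOGFILE MODULE
module_loaded() {
    grep -q "Module $2 loaded successfully" "$1"
}

# Example use:
#   module_loaded /var/log/sysboot.log sky2 && echo "sky2 loaded"
```

If it reports nothing, search the same log for the module name and look for unresolved symbol warnings.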
And yes here is the sky2-for-esxi4-0.01.tar.gz download. It includes the build script, the modified source, plus directory tree for creating the oem.tgz file including a pre-compiled copy of the module. If you can’t be bothered compiling, you can just extract this file, cd to the vmtest directory and create the oem.tgz file as per the earlier notes.
UPDATE: (2010/02/08) There is also now a driver for the Marvell 88E8001 LAN chipset (see the comments discussion below). This uses the skge driver, not the sky2 driver mentioned above. I don’t own an 88E8001, so thank you to samarium for helping out re the unresolved symbols. Please comment below if you’ve tried the skge driver and it works. I have a new tarball sky2-and-skge-for-esxi4-0.02.tar.gz containing both the sky2 and skge drivers.
UPDATE: (2010/12/24). xieliwei has a driver that should support the 88E8057 on ESXi 4.1 (as per the latest comments, it seems a bit iffy on 4.0). Anyone else who has this chipset, if you could try the driver and post more information that would be great. The 88E8057 code is here as a local copy ; sky2-1.22-for-esxi4-88e8057-r1-xieliwei.tar.gz or on mediafire; sky2-1.22-for-esxi4-88e8057-r1-xieliwei.tar.gz . Also there’s another version of the driver that will produce copious debug information (only of use if you’re having problems); http://www.mediafire.com/file/m836gce3r3b7ygi/sky2-1.22-for-esxi4-88e8057-r1-oem-debug-xieliwei.tgz
Hi there!
Firstly, a very well written blog entry.
I too have a Marvell Yukon and wanted to evaluate ESX/(i)4.
The installation isn’t proceeding beyond the unrelated complaint about not being able to load the lvmdriver.
I’m not installing from CD/DVD or USB but via PXE (obviously replacing the root oem.tgz file).
My particular chipset is a 88E8001, so slightly different to yours.
I’d hoped merely changing the ‘simple.map’ file to reflect my vendor/product ID would work (I changed mine from your 11ab:4362 to my 11ab:4320).
Upon reboot I noticed the Sky2.o module being loaded (it flashed by) but then presents me with the same lvmdriver loading error message (I have a 64GB SATA SSD and a 500GB SATA 2.5″ HDs present in a Shuttle SN95G5 (nForce3 chipset)).
Is it possible to use a precompiled Marvell .ko file from somewhere similar to the kernel 2.6 one here; http://www.marvell.com/drivers/driverDisplay.do?driverId=153?
If so; does this need to be compiled against a working install with all of the compile flags as per your source/build script?
Any ideas, as I’m loath to fill the only free PCI slot with an Intel e1000 (as I want to use it for a nice new TV-Tuner for a HTPC)?!
Looking forward eagerly in anticipation of a response.
Cheers,
John
PS: Don’t suppose you know if it’s also possible to dual boot ESX4/(i) and XenServer5.5 (and if so; know a good HOWTO)?
Wow, someone actually reads this!
Get into ‘unsupported’ mode (ie. alt-f1, type ‘unsupported’ and maybe a password) and cd /var/log and do a ;
less sysboot.log
and scroll until you find the messages about loading sky2. Mine just says ;
Module sky2 loaded successfully.
When trying to get the driver to work I tended to see a lot of ‘unresolved symbol’ warnings in the sysboot.log when it attempted to load sky2. Do you get any of these? A possibility with your onboard Marvell is that it’s referencing different parts of the driver to mine … hence the possibility of unresolved symbols.
If you run a regular linux distro on this host, do you know which network module gets loaded. Is it definitely sky2? or maybe the skge one? (maybe boot a linux ISO and get to a prompt and try lsmod |grep sk to see which one is loaded). I just had a go at getting skge to compile (using the driver from 2.6.26), and it seemed to compile ok. I have no idea if it will work, but you can grab it from here. Of course you’ll need to shove it in your oem.tgz and change the simple.map so that it references skge.o, not sky2.o. Tell me how you get on. (UPDATE: 2010/1/25. Looks like the skge driver obviously does not work. It was a ‘best guess’ compile. But it’s very difficult to diagnose what the problem is without a machine with an skge card in it. As per the comments below, if someone can post what the actual unresolved symbols are, that will be a ‘start’ in working out how to resolve it).
[...] If you want to go straight to ESXi 4.0, KernelCrash has you covered there as well. [...]
Hi,
I have a Marvell 88E8001 card (11ab:4320) and can’t get it to work with ESXi 4.
Tried sky2.o driver… Driver loaded successfully but the card is not recognized by ESXi.
Tried skge.o driver… and log says: “vmkload_mod: Can not load module skge: Unresolved symbol”
CentOS 5.4 Live CD loads skge module and card works fine.
There are pretty new drivers on the Marvell site but I don’t know how to compile them to work with ESXi 4.
Regards
Not too sure. Is there anything else listed in the sysboot.log?. Like if you set it up to use just the skge.o driver, is there anything else in the sysboot.log? Often it lists exactly what the unresolved symbols are. Alternatively, if you configure it to boot using just the sky2.o driver instead, what does ‘esxcfg-vmknic -l’ return (in ‘unsupported’ mode)?
Hi Kernel,
I’ve got 3 Marvell 88E8001 (11ab:4320) cards at home and would like to use them for ESXi 3.5 U5. I tried the SKGE.O driver you provided, but I can’t get it to work.
It did try to load SKGE.O, but there are a lot of unresolved symbols in /var/log/config.log
Did you compile SKGE.O for ESXi 4.0 only? If yes, can you make a version for ESXi 3.5 (U5)? I saw another post of yours about the Marvell Yukon NIC on ESXi 3.5, and it seems to be the same as something someone posted on vm-help.com. I did try the SKY2.O with no luck; as they mentioned, the source code of the SKY2.O driver has no 11ab:4320.
Let me know if you need any more info from me.
Thanks in advance.
Kernel,
I got more info. Based on the lspci -vvv output, the NIC was driven by SKGE under the SLAX linux live CD, so I believe SKGE.O will be the better bet. I found Marvell’s latest driver for the 88E8001 here:
http://extranet.marvell.com/drivers/files/Linux_10.81.6.3.zip
As I’ve got 2 old PCs without 64-bit & VT-x, I can only install ESXi 3.5 on them. One of them even has an onboard 3C940 NIC, which doesn’t seem to be supported either.
Looking forward to hear from you, thanks
i’ve been trying to get the latter marvell network controller working with ESXi4 also, and had same conclusions as the previous poster.
according to this debian mailing list, the driver for the 11ab:4320 would be the sk98lin.
http://lists.debian.org/debian-boot/2004/02/msg00230.html
do you know if that driver is available to compile for esxi 4?
Thanks,
~tim
As per an earlier reply, for those with an 88E8001 based card, I need more information about what is happening when esxi is booting. Like I said; Is there anything else listed in the sysboot.log?. Like if you set it up to use just the skge.o driver, is there anything else in the sysboot.log? Often it lists exactly what the unresolved symbols are (basically if someone can tell me what the actual unresolved symbols are I can have another go at compiling it with these missing symbols compiled in). Without an actual 88E8001 card I can only really guess.
ESX4i U1 (build 208167) on an ASUS P5Q-e.
$ lspci | awk '/AHCI|Gigabit/'
00:1f.2 SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller
02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller (rev 12)
07:02.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001 Gigabit Ethernet Controller (rev 14)
$ lspci -n|egrep `lspci | awk '/AHCI|Gigabit/{l=l s $1; s="|"}END{print l}'`
00:1f.2 0106: 8086:3a22
02:00.0 0200: 11ab:4364 (rev 12)
07:02.0 0200: 11ab:4320 (rev 14)
AHCI driver for ICH10R supported out of the box.
sky2 driver provided by KernelCrash works in oem.tgz but with 11ab:4364 id, thanks.
skge driver failed to load. skge is what normally runs on this box under ubuntu, so not sure what the previous commenter was talking about with sk98lin? Maybe a previous driver for this card.
sysboot.log output from system trying to load skge:
vmkload_mod: Can not load module skge: Unresolved symbol
[2010-02-06 08:55:29 'VmkCtl' warning] Loading module skge.o failed. Exec of command ‘/sbin/vmkload_mod skge ‘ succeeded, but returned with non-zero status: 1
messages output from system trying to load skge:
Feb 6 08:55:29 vmkernel: 0:00:00:16.456 cpu2:4717)Loading module skge …
Feb 6 08:55:29 vmkernel: 0:00:00:16.456 cpu2:4717)Elf: 2320: symbols tagged as
Feb 6 08:55:29 vmkernel: 0:00:00:16.464 cpu2:4717)WARNING: Elf: 1570: Relocation of symbol failed: Unresolved symbol
Feb 6 08:55:29 vmkernel: 0:00:00:16.471 cpu2:4717)ALERT: Elf: 2518: Kernel module skge was loaded, but has no signature attached
Feb 6 08:55:29 vmkernel: 0:00:00:16.471 cpu2:4717)WARNING: Elf: 2542: Kernel based module load of skge failed: Unresolved symbol
so it looks like skb_pad is the culprit.
Hey thanks for that. I guess the skb_pad bit is off the end of the lines you pasted in. I’ve found the skb_pad routine out of the same kernel I used for the skge driver (2.6.26) and shoved it into the driver source code, and recompiled. I’ve uploaded this new test version of the skge driver. Can you tell me if it works .. or perhaps gets other symbol errors?
I’ll have a look later today.
Looking at the comment I left compared to what I pasted, I think the line got chopped by wordpress, because what it said on the Elf: 1570 line between symbol and failed was
lessthan skb_pad greaterthan
so I guess wordpress is interpreting it as an unknown html token.
Now skge loads, and works in so far as I can ping test successfully. Thanks.
Not sure if it will be useful to me, but since you went to the trouble of building it, I thought it would be nice to get it tested.
If I get keen, I’ll build up my own build environment, and maybe take a crack at getting the dual port PCI-E Silicon Image 3132 adapters I have working under ESXi, but too many other projects on the go for it to happen soon.
Thanks again for the NIC drivers.
Hey, that’s great news. It’d be good to see whether it still works OK under some load. I might update the main post re the skge driver working now and encourage people to test it more thoroughly and comment. Thanks again.
kernel,
I’m trying to create and add drivers to a ESXi 4 install. Have any tips: http://www.vm-help.com/forum/viewtopic.php?f=12&t=2002
Thanks!
I don’t know much about that Broadcom card. If I was trying to get it to go, I’d work out whether there is an existing linux kernel driver for that Broadcom chipset, and use that rather than a linux driver off the broadcom site. The rationale is that a built-in linux kernel network driver is probably going to be more like the sky2 or skge drivers that I’ve modified OR some of the drivers that vmware themselves supply the source for (such as the forcedeth one). You can easily see the changes I made to say the sky2 linux driver by just downloading the linux kernel source for 2.6.26, pull out the sky2.c file and diff it against the one that I’ve included as a download here (or the skge.c one). I didn’t end up making many changes. Just have a go at trying to get your driver to compile. Once it compiles OK, then you might find some unresolved symbols popping up in the esxi boot logs, fix all those and see if it works.
Hi
Great site. I have tested it on my whitebox ESXi 4 U1 with a Dlink DGE-530T (Marvell Yukon 88E8001). This particular NIC uses pci ids 1186:4b01, so I just added that into the simple.map and rebooted. ESXi detected it successfully. I will test it further to see if I can do NIC teaming since I have two of them.
Thanks for the driver.
Hi again,
I just tested Lagg on my Dell Gigabit Switch with Dlink DGE-530T revb1 (Rev11) so far everything is running perfectly fine.
Thanks for testing the skge driver out. Greatly appreciated.
Tested the driver on my P5Q Deluxe motherboard, driver is working great on only one of the network adapter, need to make sure that the other network adapter has not been disabled in the bios. Will update later on. Thanks for a great effort.
all is working well, as i had the wrong id in the simple.map file. ESXi 4 recognises all the network adapters now. Thanks for the wonderful effort you put into these guys.
First, thank you for the wonderful insight. This has been helpful in trying to compile a driver for the tulip.o based network cards. I think I have managed to resolve all warnings and compilation errors and can successfully generate a tulip.o module, but… I get a similar error to ‘samarium’ and was wondering if you have any insight.
Questions:
Does anyone have any idea how to compile a module to include signatures or symbols?
Does anyone know what a ‘ElfRelocateFile global’ is?
Darn… so close…
ASUS M2A/VM Motherboard
ahci.o for ATi RAID
r8169.o for RealTek 8168 NIC (Nov 3 2009)
aic7xxx.o for Adaptec SCSI (Nov 8 2009)
Installed Adaptec Quartet 4 port PCI NIC that uses the DEC 21142/43 chipset, which requires tulip.o module. Source for tulip.o retrieved from latest Linux kernel source linux-2.6.33.tar.
All compilation was done with CentOS 5.4 x64 installed on external USB drive for easy removal. The hope is to use the system as a ESXi Whitebox PC.
I get the following errors when trying to load tulip.o that I compiled.
– /var/log/sysboot.log –
vmkload_mod: Can not load module tulip: Unresolved symbol
‘VmkCtl’ warning Loading module tulip.o failed. Exec of command ‘/sbin/vmkload_mod tulip ‘ succeeded, but returned with non-zero status: 1
– /var/log/messages –
cpu1:13645 Elf: 2320: symbols tagged as
cpu1:13645 ALERT: Elf: 2518: Kernel module tulip was loaded, but has no signature attached
cpu1:13645 WARNING: Elf: 2542: Kernel based module load of tulip failed: Unresolved symbol ‘ElfRelocateFile global failed’
2F049B90 info ‘ha-eventmgr’ Event 9 : Issue detected on localhost.domain.com in ha-datacenter: Elf: 2518: Kernel module tulip was loaded, but has no signature attached
Any help or insight is more than appreciated. It would be nice to be able to compile a stable tulip.o driver since is it not yet available.
Hi grittyKitty. No I don’t know what ElfRelocateFile is in relation to your problem. Usually when I get those unresolved symbols, I search through a real linux kernel tree to see where the function is defined. ie. something like;
cd /usr/src/linux
grep -r skb_pad *
And then wade through the output trying to find the function definition. I had a look through the 2.6.26 kernel I often use as a base, and I couldn’t find ElfRelocateFile. It looks more like some kind of linker type thing, rather than a kernel function (but I could be way off).
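To cut down the wading, a slightly pickier grep can help. This is a heuristic sketch only: find_definition is a made-up name, it relies on GNU grep’s --include option, and it assumes the definition starts at the left margin with the return type on the same line as the function name. Definitions split across two lines (type on one line, name on the next) will be missed.

```shell
#!/bin/sh
# Sketch: find where a function is defined in a kernel source tree.
# Keeps lines in .c/.h files that start in column one and contain
# "name(", then drops lines ending in ';' (prototypes and EXPORT_SYMBOL).
# Usage: find_definition TREE SYMBOL
find_definition() {
    tree=$1
    sym=$2
    grep -rnE --include='*.[ch]' "^[A-Za-z_].*[^A-Za-z0-9_]${sym}\(" "$tree" |
        grep -v ';[[:space:]]*$'
}

# Example use:
#   find_definition /usr/src/linux skb_pad
```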
But, anyway I had a go myself at trying to compile your driver (using the tulip stuff out of a 2.6.26 kernel). Firstly, I’m not 100% sure what module or modules get loaded by a regular linux kernel for your card (is it just de4x5, or de4x5 and tulip or something else entirely?). If you do know, please tell me. I took a guess and thought it just loads the de4x5 module for your card (which is part of the tulip stuff). There’s a lot more to this tulip driver stuff than some of the other types of network cards, so my effort is very much a ’1st attempt’. It compiles. I have a module. If you can try and see if it gets any further than yours great, or have a look at the source mods I made to see if there’s anything there that’s helpful to you and see how you go. So grab my de4x5-0.01.tar.gz and see if it helps. You’ll still need to put an entry in your simple.map. I am thinking that the simple.map line will be ‘blah blah network de4x5.o’ or similar.
One other thing. Is your card a 100Mbps card?
Wow. Impressive response. Thanx for the enthusiasm and extended effort. Seriously!
The Adaptec 4-port card is based on the DECchip 21142/43 chipset. It is PCI-based 32bit and, yes, it is 100MBps. Slightly older card, but still quite useful in Win2K3 systems. Was hoping to utilize the card instead of purchasing an expensive 4-port PCI-E gigabit card. Motherboard is an ASUS M2A-VM motherboard that utilizes an ATi SB600 RAID controller, AMD Athlon X2 64 6000+ with 8GB OCZ SLI DDR2-800.
This is the original information I grabbed when initially booting with Ubuntu to help identify the card, plus a couple notes I made to get me started.
04:04.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
Class 0200: 1011:0019 (rev41)
Chip Number: DC21142/3
Chip Description: PCI/CardBus 10/100 Mbit Ethernet Ctlr
Notes: SALVADOR
Module: tulip (http://hardware4linux.info/component/23459/)
tlp — DECchip 21x4x and clone Ethernet interfaces device driver
This is what CentOS reports…
04:04.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
Subsystem: D-Link System Inc Unknown device 1110
Flags: bus master, medium devsel, latency 64, IRQ 233
I/O ports at ac00 [size=128]
Memory at fddff000 (32-bit, non-prefetchable) [size=1K]
Expansion ROM at fdd80000 [disabled] [size=256K]
04:05.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
Subsystem: D-Link System Inc Unknown device 1110
Flags: bus master, medium devsel, latency 64, IRQ 225
I/O ports at a800 [size=128]
Memory at fddfe000 (32-bit, non-prefetchable) [size=1K]
Expansion ROM at fdc00000 [disabled] [size=256K]
04:06.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
Subsystem: D-Link System Inc Unknown device 1110
Flags: bus master, medium devsel, latency 64, IRQ 217
I/O ports at a400 [size=128]
Memory at fddfd000 (32-bit, non-prefetchable) [size=1K]
Expansion ROM at fdc40000 [disabled] [size=256K]
04:07.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
Subsystem: D-Link System Inc Unknown device 1110
Flags: bus master, medium devsel, latency 64, IRQ 58
I/O ports at a000 [size=128]
Memory at fddfc000 (32-bit, non-prefetchable) [size=1K]
Expansion ROM at fdc80000 [disabled] [size=256K]
To remove my initial compile errors and warnings I had to modify the netdevice.h file to include net_device_ops structure and a couple other changes not included with ESXi’s source, related to bit padding (noticed that you used a common.h file, I used le_byteshift.h stolen from linux-2.6.33). I have included my changes that I used to compile my tulip, http://www.asynccomputing.com/files/tulip-attempt-01.tgz.
I used linux-2.6.33.tar.bz2, and VMware-esx-public-source-4.0-208249.tar.gz for my source and followed your instructions above for setting up a CentOS 5.4 x64 system. P.S. I did have to RPM the ESXi kernel source to get anything to compile properly as you mentioned in your notes at the beginning of the thread.
Interesting. I’m running ESXi 4.0 Build 208157, yet the source is marked as Build 208249?!? Curious how your de4x5.o is 139kb and my tulip.o is 5609kb. Makes me wonder if I’m actually compiling this correctly.
I compared our tulip files and they are slightly different (minus the code adjustments we made). I tried de4x5.o and got a kernel crash, or at least I think it’s a kernel crash, i.e., big nasty purple screen with text. Let me know if you want me to type out the error information from the screen. There was no room on the disk to grab a data dump since I’m using ESXi on a USB for testing purposes until I get it right — then I’ll burn an install CD when testing is complete.
P.S. ESX Server 2.0.2 appears to work fine under CentOS 5.4 x64. Alternative solution =) … and some users claim that it is better on resources, but slightly slower. The debate is open on this topic. Another time.
I’ll check out what you mentioned about the ‘ElfRelocateFile’ and see what else I can find. Google prefers searching for ‘Elf Relocate File’, but doesn’t yield anything useful.
Thank you!
What fun!
Hi,
I’m having the same problem. And I have the same motherboard. And I looked it up, my PCI ids are the same, too.
But I only have a Windows PC and I would like to avoid putting too much stuff like VMs on this PC.
I couldn’t get my USB stick working:
I tried to extract the contents of the .iso image for my esxi on the stick and made it bootable with syslinux. But it didn’t work.
Then I tried to put the created oem.tgz on the CD where I burned the .iso image earlier. But There was no change. It still said, that it does not have any network driver.
But perhaps I mixed something up, and you did the oem.tgz file onto an other partition of your usb device where you extracted the dd file to?
Because I didn’t understand the part with the oem.tgz at all
What’s the easiest way to get it working? Is there a possibility to put it into the .iso image or do I have to take a different one?
Good evening,
I compiled your version of the tulip source, but opted to try and create a tulip.o module instead of the de4x5.o. There were a couple warnings, but the module was created. The tulip.o file was, again, around 5MB in size.
I rebooted with the new tulip.o module and again, received the nasty PSOD (purple screen of death). Lots of memory addresses and registers from the looks of it.
I then restored my tulip source, recompiled and rebooted and was able to return back to the same spot I originally mentioned, i.e., Unresolved symbol, module tulip.o failed, /sbin/vmkload_mod tulip succeeded, but returned with non-zero status.
I wanted to make sure I could recompile the tulip source I was working on and had not inadvertently buggered something that was causing the PSOD.
I’ll review further on the weekend when I have more time. Thanx for listening.
darkdragon, I’ve never tried to recreate the iso. It always seemed easier to play around with a bootable USB stick since you just need to dd the appropriate file onto the usb stick (and you don’t need to do anything with syslinux). I guess I’ve done everything using a linux host since that seemed easier from my perspective. One thing about the file you dd onto the usb stick is that it includes partition mappings etc … which means you end up with that partition 5 that you can just mount and update the oem.tgz in. My guess is that there is a rather different technique if you want to recreate the iso, since isos don’t generally have partition mappings. I guess this isn’t much help, but I’d probably recommend getting some kind of small VM running on your windows box, running some linux, that you could pass the USB stick through to, dd stuff to, mount etc.
grittyKitty, I think I was wrong about the de4x5 thing, and your trying to get it running as tulip is probably the right way to go. However I’m not sure I understand the interaction between the two. If you do an lsmod on your ubuntu running on the box with the quad card, do you see both the tulip module and the de4x5 module loaded? or just the tulip module? I notice in your source that you don’t compile de4x5.c. Actually I’m a bit confused about de4x5.c. Is it some leftover from an earlier version of the driver and is not used anymore in linux? There’s a whole heap of text at the start of de4x5.c sort of implying it could be used independently and that it is aimed at the 21142/43 chipset. It makes me wonder whether if you just created a build script for de4x5 and vmklinux_module bit and linked them whether it would work. I just did. Interestingly that does compile with no errors. I’ll have more of a look over the weekend
Thank you for your reply. Then I’ll try it with the USB-stick, too. But I didn’t get it working…
What do I have to do exactly, when I don’t do this compiling stuff?
I downloaded the source code, but I think I don’t need this if I would compile it myself.
I tried to dd the dd-file from the iso-image (image.tgz | I took it from the “normal” downloaded esxi – not source) to my usb-stick. my ubuntu did it without errors. But when I tried to access the drive, it gave me strange file-names. When I tried to boot from the stick, it didn’t work.
What am I doing wrong?
btw: Is it possible to get E-Mail notifications when somebody else posts a reply?
Yeah, not too sure what’s going wrong for you. There are plenty of examples on the net of how to get esxi onto a usb stick. I just googled for ‘esxi usb stick dd’ and there are plenty of useful results.
darkdragon, I just pulled the dd file from the iso and then used WinImage on a WinXP system to write it to a blank 4GB USB stick. Worked like a charm. Used this link… http://www.techhead.co.uk/how-to-create-a-bootable-vmware-esxi-usb-pen-drive.
p.s. WinImage did not work on Windows 7, so I had to run it on a WinXP VM
kernel, I have to check, but I’m pretty sure the tulip module was loaded when using ubuntu, but I need to verify before committing… I was also wondering about compiling the de4x5 code using the tulip files I extracted from the linux-2-6-33 files.
I’ll play more with this on Sunday, when I have more time… thank you for the extra pair of eyes
Hello, I’m trying to compile sunhme module for my quad ethernet 10/100 card.
Unfortunately, using your solution, when I try to load module on ESX, it says :
May 1 10:35:19 node02 vmkernel: 0:00:47:44.486 cpu2:4104)Loading module sunhme …
May 1 10:35:19 node02 vmkernel: 0:00:47:44.487 cpu2:4104)WARNING: Elf: 1062: found undefined __this_module symbol
May 1 10:35:19 node02 vmkernel: 0:00:47:44.487 cpu2:4104)Elf: 2320: <sunhme> symbols tagged as <GPL>
May 1 10:35:19 node02 vmkernel: 0:00:47:44.493 cpu1:4104)WARNING: Elf: 1570: Relocation of symbol <sunhme_HeapID> failed: Unresolved symbol
May 1 10:35:19 node02 vmkernel: 0:00:47:44.493 cpu1:4104)WARNING: Elf: 1570: Relocation of symbol <sunhme_SkbHeapID> failed: Unresolved symbol
May 1 10:35:19 node02 vmkernel: 0:00:47:44.493 cpu1:4104)WARNING: Elf: 1570: Relocation of symbol <pci_map_rom> failed: Unresolved symbol
May 1 10:35:19 node02 vmkernel: 0:00:47:44.493 cpu1:4104)WARNING: Elf: 1570: Relocation of symbol <pci_unmap_rom> failed: Unresolved symbol
May 1 10:35:19 node02 vmkernel: 0:00:47:44.493 cpu1:4104)WARNING: Elf: 1570: Relocation of symbol <pci_unmap_rom> failed: Unresolved symbol
May 1 10:35:19 node02 vmkernel: 0:00:47:44.493 cpu1:4104)WARNING: Elf: 1570: Relocation of symbol <sunhme_HeapID> failed: Unresolved symbol
May 1 10:35:19 node02 vmkernel: 0:00:47:44.498 cpu1:4104)ALERT: Elf: 2518: Kernel module sunhme was loaded, but has no signature attached
May 1 10:35:19 node02 vmkernel: 0:00:47:44.498 cpu1:4104)WARNING: Elf: 2542: Kernel based module load of sunhme failed: Unresolved symbol <ElfRelocateFile global failed>
Do you have any idea of the problem ?
Best regards.
I read through your previous blog-post with the 3.5 version. Perhaps you should mention in this post, that you should read the first one, too.
I finally got the dd to my USB-stick. I didn’t set the parameter bs=32k last time. How was I supposed to know about it?
But then it didn’t boot from it… But there was no error, either. How long did it take for you?
But it worked using plpbt for my vmware testing machine. Perhaps my computer has any problems booting from a USB…
I used the dd for windows. I went through many articles, but it either gave me an error, or it just stopped after a time (the progress number didn’t go up. I waited over half an hour). I formatted the stick with diskpart, too, even if I don’t think this was useful. I started dd as admin and the command prompt as admin, too. I’m trying WinImage later (Thank you for the tip!). Perhaps it works on Win7 in compatibility mode…
I was trying some other tools, too. Perhaps the problem is at the server. Because it didn’t work to directly boot from it (see further on the top). Perhaps I’m trying this with the plpbt trick, too later on.
I’m now trying to create a iso-image.
For this, I first took an dd-image from my USB stick again. Perhaps it’s better to use a virtual HardDisk with exactly 900MB than the resulting file is smaller. then I built the image.tgz (with the same directory tree and so on). Then I used ISO Master to replace the image.tgz file. Now I’m burning it on a CD. I’ll tell you later if it worked or not…
Winimage does the same as dd. it just stops (at 70% and then it does not get more (waited up to 15 min)) ( – and it’s not freeware).
Well I tried out the iso thing, and it didn’t work. (No network adapter found). Reason: see next step!
I tried out to boot from usb with help of plpbt. And it loaded, but it still tells me: “No compatible network adapter found.”
What am I doing wrong? I dd’d it to the USB-stick, and then replaced the oem.tgz file with the network-drivers (sky2.o).
perhaps my oem.tgz file is wrong…
I built it the following way: I put all contents of vmtest (etc and usr) and created a tar archive. I gzip’d it and then renamed it (using 7-zip in windows). So when I’m opening that file, there are the two directories (etc and usr) in there. Is this right?
Or do I have to put the sky2.o directly in that archive?
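For reference, the layout described above (a gzip’d tar with etc and usr at the top level) can be sketched in Python. This is only a minimal sketch: the function names are made up for illustration, and the staging directory is assumed to be something like the vmtest directory mentioned above.

```python
import tarfile
from pathlib import Path


def build_oem_tgz(staging_dir, output_path):
    """Pack everything inside staging_dir into a gzip'd tar archive.

    Entries are stored relative to staging_dir, so directories such as
    etc/ and usr/ end up at the top level of the archive.
    """
    staging = Path(staging_dir)
    with tarfile.open(output_path, "w:gz") as tar:
        for child in sorted(staging.iterdir()):
            # arcname drops the staging directory prefix from the stored path
            tar.add(str(child), arcname=child.name)
    return output_path


def top_level_entries(archive_path):
    """List the distinct top-level names stored in the archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted({name.split("/")[0] for name in tar.getnames()})
```

Calling top_level_entries on the result is a quick way to check that etc and usr sit at the top of the archive rather than under an extra parent directory.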
Now everything worked. I could manage it from the vSphere client.
But when I restarted it, there was nothing on the HDD.
So do you need to install it with the iso-image?
Hey that sounds like you have it working if you can talk to it using the vSphere client. The thing with the USB stick method is that you always boot off the USB stick and just use your real hard drives as storage. Having ESXi always on the USB stick is not such a bad thing as it’s relatively small and loads mostly into memory anyway (ie. there’s generally not much write activity back to the USB stick which would be relatively slow)
torrmkr, I had a quick go at a ‘first attempt’ compile of a sunhme out of a 2.6.26 kernel. I must admit I’m not too sure about some of the unresolved symbols you’re getting, namely the sunhme_HeapID, sunhme_SkbHeapID and the ElfRelocateFile (grittyKitty has highlighted these ElfRelocateFile issues as well). But the pci_map_rom and pci_unmap_rom ones are reasonably explainable. These routines are normally in pci.c in a linux kernel, but ESX doesn’t have them. We can either copy the ones from the pci.c in a real linux kernel, or work out some other option. For now (since this is a first attempt), I’ve effectively commented the calls to pci_map_rom and pci_unmap_rom out. They’re only used to work out the MAC addresses of the interfaces. There’s an alternate piece of code in sunhme.c which simply uses a random number for the MAC address if it has a problem reading the ethernet card’s rom. That’s ‘good enough’ until we actually get the driver working. Anyway, I’ve uploaded sunhme-0.01.tar.gz . Like I said it’s a ‘first attempt’ and I suspect it looks very similar to your code changes. Tell me if it works any better or worse.
Hello, the module has been loaded successfully.
Unfortunately, although I’ve added right PCI IDs, it doesn’t find any interface (driver hme claimed 0 device)
This is the output of /var/log/vmkernel
————————————–
May 2 11:18:15 node02 vmkernel: 0:00:21:38.301 cpu2:4106)sunhme loaded successfully.
May 2 11:18:15 node02 vmkernel: 0:00:21:38.301 cpu2:4106)ALERT: Elf: 2518: Kernel module sunhme was loaded, but has no signature attached
May 2 11:20:59 node02 vmkernel: 0:00:24:22.747 cpu2:4106)Loading module sunhme …
May 2 11:20:59 node02 vmkernel: 0:00:24:22.748 cpu2:4106)Elf: 2320: <sunhme> symbols tagged as <GPL>
May 2 11:20:59 node02 vmkernel: 0:00:24:22.758 cpu1:4106)module heap : Initial heap size : 102400, max heap size: 4194304
May 2 11:20:59 node02 vmkernel: 0:00:24:22.758 cpu1:4106)module heap sunhme: creation succeeded. id = 0x4100bbc00000
May 2 11:20:59 node02 vmkernel: 0:00:24:22.758 cpu1:4106)module skb heap : Initial heap size : 524288, max heap size: 23068672
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)module skb heap : creation succeeded
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)PCI: driver hme is looking for devices
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)PCI: Trying 0000:00:02.0
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)PCI: Trying 0000:00:02.1
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)PCI: Trying 0000:00:04.0
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)PCI: Trying 0000:00:04.1
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)PCI: Trying 0000:00:06.0
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)PCI: Trying 0000:00:0a.0
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)PCI: driver hme claimed 0 device
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)Mod: 2986: Initialization for sunhme succeeded with module ID 57.
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)sunhme loaded successfully.
————————————–
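When scanning a long vmkernel log like the one above, the line that matters is the “claimed N device” one. A small Python sketch that pulls it out is shown here; the regular expression is only based on the log lines quoted in these comments, so other ESX versions may word the message differently.

```python
import re

# Matches e.g. "PCI: driver hme claimed 0 device" (case-insensitive, since
# some logs quoted in this thread print "pci:" in lower case).
CLAIM_RE = re.compile(r"pci: driver (\S+) claimed (\d+) device", re.IGNORECASE)


def claimed_devices(log_text):
    """Return {driver_name: device_count} for every 'claimed' line found."""
    counts = {}
    for line in log_text.splitlines():
        match = CLAIM_RE.search(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts
```

A count of 0 for a driver that loaded cleanly usually points at the PCI ID mapping rather than at the module itself.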
After some retries, I decided to create a new file called /etc/vmware/pciid/hme.xml that contains the new PCI addresses :
<?xml version='1.0' encoding='iso-8859-1'?>
<pcitable>
  <vendor id="108e">
    <short>Sun</short>
    <name>Sun Microsystems Computer Corp.</name>
    <device id="1001">
      <vmware label="nic">
        <driver>sunhme</driver>
      </vmware>
      <name>Sun Happy Meal</name>
      <table file="pcitable" module="ignore" />
      <table file="pcitable.Linux" module="sunhme">
        <desc>Sun Microsystems Computer Corp.|Sun Happy Meal</desc>
      </table>
    </device>
    <device id="1000">
      <vmware label="nic">
        <driver>sunhme</driver>
      </vmware>
      <name>Sun Happy Meal</name>
      <table file="pcitable" module="ignore" />
      <table file="pcitable.Linux" module="sunhme">
        <desc>Sun Microsystems Computer Corp.|Sun Happy Meal</desc>
      </table>
    </device>
  </vendor>
</pcitable>
Then I re-ran the esxcfg-pciid executable, which recreated a correct simple.map (I’m on ESX 4 and I didn’t create an oem.tgz).
Now it works correctly, it sees all the NICs and they have been added to vSwitch to simulate NIC TEAMING. So, the driver works !!!
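If several device IDs need the same treatment, a pciid file with this structure can be generated rather than typed by hand. The following Python sketch mirrors the element layout of the hme.xml above; the function name and parameters are hypothetical, and it does not emit the XML declaration line, which would have to be added separately.

```python
import xml.etree.ElementTree as ET


def build_pciid_xml(vendor_id, vendor_short, vendor_name,
                    device_ids, device_name, module):
    """Build a <pcitable> document with one <device> block per device ID."""
    root = ET.Element("pcitable")
    vendor = ET.SubElement(root, "vendor", id=vendor_id)
    ET.SubElement(vendor, "short").text = vendor_short
    ET.SubElement(vendor, "name").text = vendor_name
    for dev_id in device_ids:
        device = ET.SubElement(vendor, "device", id=dev_id)
        vmware = ET.SubElement(device, "vmware", label="nic")
        ET.SubElement(vmware, "driver").text = module
        ET.SubElement(device, "name").text = device_name
        ET.SubElement(device, "table", file="pcitable", module="ignore")
        table = ET.SubElement(device, "table",
                              file="pcitable.Linux", module=module)
        ET.SubElement(table, "desc").text = f"{vendor_name}|{device_name}"
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(root, encoding="unicode")
```

For the card in this comment the call would be along the lines of build_pciid_xml("108e", "Sun", "Sun Microsystems Computer Corp.", ["1001", "1000"], "Sun Happy Meal", "sunhme").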
The only problem I have is that on reboot I must manually run the command “vmkload_mod sunhme” to load the module. Is there a way to do this automatically?
Best regards.
I solved this issue by editing the /etc/rc.local file. Unfortunately esxcfg-module didn’t work as it should (enabling the module on boot).
[root@node02 ~]# cat /etc/rc.local
#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don’t
# want to do the full Sys V style init stuff.
vmkload_mod sunhme
touch /var/lock/subsys/local
[root@node02 ~]#
Best regards.
torrmkr, that sounds like you have it working. I must admit I don’t quite understand the problem you’re seeing re not detecting the sunhme devices, and the use of that xml file and esxcfg-pciid to resolve it, but if it works for you that’s great (I’ve only ever used the oem.tgz and simple.map on ESXi 4 per my post). I was thinking a bit more about how to avoid the random MAC addresses, and a simple cheap workaround might be to change the sunhme.c code from;
to something like;
Still not the greatest solution, but at least you know what the MAC addresses will be.
Hello Kernel,
Just an update. I had to give up on ESXi 4 and trying to compile the tulip.o modules.
Ultimately switched to using VMware Server 2.0.2 on CentOS 5.4 x64. Everything configured fine since CentOS includes the tulip.o modules.
Ran VMware Server 2.0.2 for the past month and I’d have to give it a 6/10. The web interface continually dies every time I try to update, create new VM’s and I must constantly restart the web host. VMware opted for a web-based GUI to manage VM’s and I’d have to agree with a lot of the blogs out there, quite frankly, it sucks and renders the product semi-usable. Also, sometimes my VM’s will just blink off after a few days with no rhyme or reason. This is a development box, but some stability would be nice.
I’m now switching to using Win2K3 Server x64 Standard with VMware Server 2.0.2. I’ll report back after this experience.
Again, thanx for your input.
Best…
grittykitty@kissthesquirrel.com
Thanks for the update grittyKitty. That’s too bad re the tulip driver. Some of these drivers seem to be harder than others to get ‘working’.
Worked for me. Using Marvell Yukon 88E8071 (identified by lspci as Galileo Tech Ltd 88E8071 network adapter, venID:prodID 11ab:436b) – using sky2 driver.
Machine, for reference, is Acer Aspire M3300 Athlon 620 x4
Awesome! Million thanks.
^^^ never mind… The 88E8071 works at the beginning, ping is good and I can load the HTTP homepage but then it just hangs all nw traffic, no ping reply no restart no nothing. Oh well, it was worth the try.
That’s no good. I was thinking that it was a newer Marvell chipset, but it looks like an older one (it’s referred to in the older sky2.c sources). I guess check the various ESXi log files to see if there are any debug messages about what’s happening. Also try ‘esxcfg-nics -l’ to check the interface status to see if that has any useful info.
Thanks for reply.
Logs show no sign of failure on network side. The esxcfg command shows link is up & normal, sky2 driver loaded etc. A patch might be needed as described here http://www.mail-archive.com/netdev@vger.kernel.org/msg39532.html but not sure if that would work.
Another thing that might be wrong – I am trying your binaries on ESXI 4.0 Update 1 – maybe some kernel incompatibility? I just had the server machine freeze on me when trying to download the vSphere over web interface. I’m pondering whether it is cheaper to just buy a NIC.
shadowncs, I had a look through the patch on that http://www.mail-archive.com/netdev@vger.kernel.org/msg39532.html link, and compared a lot of it against the driver code in the sky2.c and sky2.h source I was using. The source I was using seems like it already had something similar applied anyway. But, I did have a go at manually applying that patch. I’ve compiled it and uploaded it as sky2-for-esxi4-88e8071-test1.tar.gz. Feel free to try it and see if it works any better. I have low expectations though.
Hello Kernel,
I have a problem with the driver. I’ve written it all up here http://www.vm-help.com/forum/viewtopic.php?f=12&t=2421&start=0. Maybe you can help me?
Best regards.
I’ll have a look if I get some time (am a bit busy lately). I guess I would first try a regular linux distro on this system, using the same kernel version that you’re using as a base for your esxi driver, and confirm that the rhine driver can detect and work with that card.
mav1, OK, I had a bit of look into this. I guess the driver you are trying is based on the GPL source that D-link released. I find the built in drivers that come with a linux kernel are a better source when you are trying to adapt a driver for ESXi. Basically, ESXi ‘looks like’ an older 2.6 kernel. So, like a lot of my attempts I used a driver from 2.6.26. From the research I did it looks like the via-rhine driver in the normal linux kernel is meant to work with your DFE 520TX card, but I can’t really confirm that since I don’t have such a card. So I adapted the 2.6.26 via-rhine driver so that it would compile. I had to comment out a heap of mii related stuff in the source to get rid of a lot of unresolved symbols. My thoughts are that the card may still configure itself with these bits missing and might work ‘poorly’ … but again, I’m not too sure until you try it. I also had to comment out some code related to ‘Rhine I’ cards that are based on the VT86C100A chip. I don’t think your DFE card has one of those chips. I’ve uploaded via-rhine-for-esxi4-0.01.tar.gz. That contains a compiled driver, source and an example simple.map. I took a guess and guessed that your PCI IDs (for the simple.map) are 1106:3106. If they’re different, you’ll need to edit the simple.map and change the line with 1106:3106 on it to match your PCI IDs. NB: This is just what I would call a ‘first attempt’. Don’t expect too much.
Hello all. I hope my request that I’m about to make doesn’t annoy or frustrate anyone but this is my dilemma, I’m not Linux savvy at all I’m more of a GUI guy but i know this whole process is all console work. I have a Marvell Yukon 88e8056 with a PID of 11AB:4364. Is it possible for someone to create the files for me and i replace them in the ISO and install? or does it have to be installed and then i replace the files? Any bit of guidance is appreciated. thanks all.
Hello kernel,
I’ve tested your driver. It works!!! Thank you very much for the help.
With best wishes.
mav1, that’s great news. You should try and copy some large files across the network to get a feel for the performance. Like I said, I commented out a lot of the mii stuff in the driver .. which is often used for setting duplex modes etc and ultimately I wasn’t sure just how critical those bits were. If the duplexing is wrong, your network performance will likely suffer.
Jinx, maybe someone else reading this page can help you. I’m a bit too busy to provide much help at the moment. I guess I always found the USB stick install of ESXi to be pretty easy, and if you google you can probably find some Windows howtos on how to set it up (I’m assuming you’re a Windows user. Apologies if you’re not). The key benefit of installing to USB stick is that you can read and write to it easily. I haven’t tried plugging the USB stick into Windows, but the main partitions on it that you need to copy/edit things on are all FAT partitions, so Windows should be able to mount them. There are a few comments above by darkdragon that might help point you in the right direction. re your 88E8056 chipset NIC. That’s very similar to the 88E8053 on the motherboard that I have … and I can see that the sky2 driver thing will most likely work … assuming you can create your oem.tgz and get it onto the USB stick. Have a look at my post on setting up the Marvell LAN on ESXi 3.5 (linked to in the main article above). There’s a bunch of stuff in there that is quite relevant … and might help point you in the right direction.
Hi,
Not working here… Using Marvell Yukon 88E8057 (Galileo Tech Ltd). Using sky driver and module loads but esxcfg-vmknic -l returns nothing. PID is 11ab:4380. Any thoughts would be very appreciated.
Regards
naik, there’s a PCI ID table near the start of the sky2.c file. It lists all the card IDs. It didn’t include 4380, so I’ve added it in and recompiled. I’ve uploaded sky2-and-skge-for-esxi4-0.03-test1.tar.gz which includes the new sky2.o. Do you want to give that a try and see how you go.
I will and get back to you, many thanks
Hi
The driver loads but produce the error:
sky2 0000:03:00.0 unsupported chip type 0xba
lspci shows the module is loaded for 11ab:4380
I have a fedora kernel driver source for 2.6.x if that helps:
http://www.syscore.se/install_v10.85.9.3.tar.bz2
regards
OK, I’ve added in some more of the chipset detection stuff from the later drivers. Can you try this; sky2-and-skge-for-esxi4-0.03-test2.tar.gz
OK.
The driver loads. The chip and the two controllers are detected.
But it seems any protocol access to the host hangs the server.
I’ve been reading about workarounds regarding the chip and tried the following:
“ethtool -K vmnic0 sg off” and “ethtool --offload vmnic0 rx off tx off sg off tso off” but get the error “function not implemented”
I can ping the host but that’s it. Unfortunately it may be related to the previous post: http://www.mail-archive.com/netdev@vger.kernel.org/msg39532.html
Any thoughts?
Regards
Try this one ; sky2-and-skge-for-esxi4-0.03-test3.tar.gz . Only has some minor changes from a later linux driver.
Now I get some more specific errors:
Warning: LinNet: netdev_watchdog vmnic0: transmit timed out
vmnic0:tx timeout
transmit ring 477..41 report=41 done=41
BUG:warning at vmkdrivers/src_v4/vmklinux26/vmware/linux_net.c:3239/netdev_watchdog() (inside vmklinux)
sky2 vmnic0 disabling interface
..tx, rx, sg, autoneg do not work with ethtool: function not implemented
regards
I’m having the same problem in ESXi 4.1 with Marvell Yukon 88E8056
“netdev_watchdog vmnic0: transmit timed out”
sorry, it’s 88E8057
Hi again,
Do you have any more input regarding the Marvell Yukon that could push my case forward?
I have a question of a more private nature; if you want, you can reach me at this temporary email address:
MgRQcqbcjTy5i3Rj@spam.seydisehirmyo.net
Regards
Sorry, I guess I’m coming to the conclusion that there is something a bit harder to work out with your card. I’ll try to have another look, but I won’t have much time until the weekend.
I have an Intel Entry Server Board (SE7221BA1-E) with an on-board Marvell 88E8050 card (11AB:4361).
Currently it is running with VMware ESXi 4.1.0 260247 with your sky2.o driver. All that was required to make it work was add the following line to the simple.map file:
11ab:4361 0000:0000 network sky2.o
Thank you very much for your excellent work porting this driver to ESXi.
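For anyone adding their own PCI ID, the simple.map line quoted above has four whitespace-separated fields. A small Python sketch for building such a line and appending it only if the ID is not already listed follows; the helper names are made up, and the parameter names subsystem and dev_class are only my guesses at what the second and third columns mean.

```python
def simple_map_line(vendor_id, device_id, module,
                    subsystem="0000:0000", dev_class="network"):
    """Format one simple.map entry, e.g. '11ab:4361 0000:0000 network sky2.o'.

    The meaning of the second and third columns is assumed here; the default
    values are simply copied from the line quoted in the comment above.
    """
    return f"{vendor_id.lower()}:{device_id.lower()} {subsystem} {dev_class} {module}"


def add_entry(map_text, line):
    """Return map_text with line appended, unless that PCI ID is already present."""
    key = line.split()[0]
    lines = map_text.splitlines()
    for existing in lines:
        fields = existing.split()
        if fields and fields[0].lower() == key:
            return map_text  # already mapped, leave the file content untouched
    lines.append(line)
    return "\n".join(lines) + "\n"
```

The PCI ID is lower-cased because the entries shown in these comments are lower case; whether ESXi actually cares about case is not something I have checked.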
I have done some more digging around the 88E8057 driver. Earlier the chip was supported under the skge and sk98lin packages. When the code was introduced in sky2 and compiled with different kernels it stopped working for most of the distributions. Here are some success stories:
https://forums.openfiler.com/viewtopic.php?id=4894
http://linux.derkeiler.com/Newsgroups/comp.os.linux.networking/2009-02/msg00110.html
http://tima-sls.imag.fr/viewgit/rabbits/?a=viewblob&p=Linux&h=daf961ab68bc27e3fa16ddea89784aa0091b42e2&f=drivers/net/sky2.c
Unfortunately that’s all I can help you with. I understand your time is limited and am grateful for your support. I’ve also been in contact with Marvell and am awaiting their answer. My earlier email address didn’t work. Here’s a new one: marvellyukon2010@live.se
naik,
I’ve gone through a much later kernel (2.6.33.7). Try this one ; sky2-and-skge-for-esxi4-0.03-test4.tar.gz .
Hello Kernel, your post about the ESXi kernel module literally blew up the whole VMware community, thanks for that!
Currently I’m trying to get linux software RAID working with ESX. Could you please give your opinion on whether it is possible?
Here is my output when I’m trying to load the module on ESX:
[root@ESX ~]# vmkload_mod -v100 vmkdriver-raid1-KLnext/release/vmkernel64/raid1.o
Verbose:100
After loop: argc:3 optind:2 optarg:(null)
modParams is || (0)
nameIn:vmkdriver-raid1-KLnext/release/vmkernel64/raid1.o
nameOut:raid1 pathOut:vmkdriver-raid1-KLnext/release/vmkernel64/raid1.o
fd:4 size:-2346392 st_size:1096059
VmkModSign_GetSignInfo starting
vmsignContent does not exist
name:raid1 addr:0xf7b65000 size:0x10c000 modParams:
vmkmod: failed at bora/lib/vmkmod/vmkmod.c:VMKMod_LoadModule:240
vmkmod: failed at bora/lib/vmkmod/vmkmod.c:VMKMod_Load:437
vmkload_mod: Can not load module vmkdriver-raid1-KLnext/release/vmkernel64/raid1.o: Unresolved symbol
Some time back I did look into porting the md raid driver stuff in linux to ESXi … but quickly concluded it would be a lot of hard work. I haven’t looked at it since.
Possibly not what you’re after … but there are some interesting sata chipsets made by Jmicron that can seemingly do proper hardware RAID, such as the JMicron 393. They aren’t the cheapest, nor are they easy to find, but they are more likely to come down in price compared to say an Adaptec controller. Here’s a review of two of those Jmicron 393 cards. Admittedly I have not tried out one of these cards.
hi @kernel
thanks for all your hard work on the 88E8057 card so far.
i have a Shuttle SX58J3 which has two of them on board.
using the latest sky2-and-skge-for-esxi4-0.03-test4.tar.gz driver you kindly provided us, the cards are recognized, but they do not answer pings, nor do they get an IP from DHCP.
is there something that would help you investigating? some kind of debug output or whatever?
Do you have such a NIC in an ESXi compatible system? would probably be helpful. seems like a good reason to donate your work! where is the paypal button?
Thanks for the feedback on the sky2-and-skge-for-esxi4-0.03-test4.tar.gz driver. I too am puzzled why this 88E8057 is so hard to make work. From your comment, I am assuming that the module loads OK and esxcfg-nics -l returns what looks like a configured interface.
I only have an old motherboard with an 88E8053 on it, so any updates to the driver are in many ways ‘compiling blind’ since I don’t have a physical 88E8057 card or motherboard to test with.
One thing I’m interested in finding out is whether the card actually does work on a recent linux kernel (if you’re using windows, you could just try booting a recent Ubuntu live CD and see whether you have network connectivity … without actually installing linux). The recent linux sky2 driver that I’m using as reference appears to support the 88E8057 … but just because it’s mentioned in the source code doesn’t mean it actually works.
i can confirm now, that it works in a ubuntu 10.04.1 live cd with kernel 2.6.32-24-generic
i’m now trying out an older version, like 8.04, and also the latest knoppix.
if there is anything else i can do, just tell me. i’m using windows but am quite familiar with linux, so i could also install a CentOS or whatever if it helps.
forzyte, Thanks for the feedback re the 10.04.1 live CD. Given that I based that test4 code on 2.6.33.7, I’m guessing there is something a bit more fundamental missing, in order for it not to work on ESXi. I’ll have another look at the code. There are always parts of the driver that I end up commenting out (which I gauge as non-critical), because some function calls have changed or don’t exist in the ESXi kernel. Might be the weekend before I get much of a chance to look at it.
hi, i read that some of you guys got marvell yukon 8056 running.
i got an asus p7f-x. i’ve copied the imagedd of esxi 4.1 to a usb stick. now i can start the server and esxi comes up and tells me that it can’t find any compatible nics.
i downloaded sky2 driver and added it to lib path in oem.tgz. of course i added simple.map and the PID.
now esxi outputs the same message. it cannot find any nics.
i looked up simple map on real path and for me it seems that the system isn’t copying the new simple.map file.
the sky2 is copied!
any idea, someone?
thanks!
another thing: i did not mount the usb stick to an unix. i worked directly on the esxi console!
irise, I am a bit puzzled how you created the oem.tgz from the esxi console. Did you somehow copy the sky2.o and simple.map from Windows first into one of the FAT partitions on the USB stick, then try and tar them up from on the ESXi console?
i’ve dumped imagedd to usb stick via winimage.
but now i got it running after editing Hypervisor1 partition and oem.tgz with an ubuntu live cd.
now i got another problem. esxi won’t get an dhcp lease. and if i configure it manually, i’m neither able to ping it via crossover cable nor via LAN. esxi isn’t able to ping anything, too.
any idea?
i used sky2.o and interface state is connected. i also use dmraid for intel ich8r support in oem.tgz.
i’ve found the following lines in message log:
pci: driver sky2 claimed device …
…
vmnic0 not yet opened
…
pci: driver sky2 claimed 2 devices
…
uplink: 12981: opening device vmnic0
…
sky2 vmnic0: enabling the interface
netport: 982: enabled port 0x2 with mac 00:00:00:00:00:00
uplink: 131119: vmnic0 is opened
…
mod: 4163: initialization of sky2 succeeded with module ID31.
sky2 loaded successfully.
…
ALERT: Elf: 3028: Kernel module sky2 was loaded, but has no signature attached
same with vmnic1 in between. i’ve only posted messages that seems important for me.
i donno how to set up the interface with a valid mac address.
is there something like ifconfig on esxi console?
irise, Have you tried disabling one of the nics in the BIOS? (just to see if it configures properly in that case). Also, there are multiple attempts at updates to the sky2 driver if you look back through the comments above. The latest one (sky2-and-skge-for-esxi4-0.03-test4.tar.gz) is not necessarily the best one for your card. I’d probably try all of them.
If you look back at the comments above, I can see that samarium had a 88E8056 in his PC, but the discussion at that point was all about getting a 88E8001 running using the skge driver. But there are comments by eo29 who had a P5Q deluxe board (which has a 88E8056 and a 88E8001 I think) … and he confirmed he was able to get both to work.
hi, at this moment i am using sky2-and-skge-for-esxi4-0.03-test4.tar.gz.
i will try older ones in the next days.
at vm-help.com forum i’ve read that some guys got 8056 running with sky2.
i will test and confirm whether that’s right or wrong.
thanks.
Hi, i got it running with sky2 out of sky2-0.02 package.
now i want to get raid1 running on p7f-x.
but dmraid is not working with ich8r.
think i have to buy a pci raid module.
irise, thanks for the feedback. The sky2-and-skge-for-esxi4-0.03-test1/2/3/… are all various attempts to add in support for the 88E8057 card. There has been a lot of rewriting in each, since in each case so far my attempts did not work, so maybe I’ve introduced a bug into those 0.03 ones that prevent the whole driver from working.
Anyway, for reference for anyone else reading these comments; the sky2-and-skge-for-esxi4-0.02.tar.gz should support a bunch of chipsets including the 88E8053, 88E8001, 88E8056 (and possibly other Marvell nics). But the sky2-and-skge-for-esxi4-0.03-test1/2/3… downloads are various attempts to get support for the 88E8057. As it stands none of these work correctly for the 88E8057.
I was able to install ESXi 4.1 on a Macmini2,1 with your sky2 driver. Thanks a million!
Ever tried using the marvell provided sk98lin driver? I’m trying to get a 88E8057 working as well.
I have had a brief look at the marvell sk98lin driver. It looked significantly different to the sky2 and skge drivers in the linux kernel, so I didn’t look any further (ie. it looked like a painful effort to port it). I still think there might be some minor error in some of my later attempts to get the driver working on the 88E8057 (ie. the sky2-and-skge-for-esxi4-0.03-test1/2/3 downloads), as I think these later ones don’t even work with the older Marvell chipsets that do work with the sky2-and-skge-for-esxi4-0.02.tar.gz version. For reference (from memory) the 0.02 release uses a modified driver from a 2.6.26 kernel as this is ‘close’ to the linux-ish kernel in ESXi, and all the 0.03 test versions incorporate code from later linux kernel versions where supposedly the 88E8057 is supported.
Yes, it does look very different with all the compilation units. I was thinking just linking all the resultant objects together would do, but don’t have the skills to do it.
What is the latest sky2.c version (it’s defined in the file) that you’ve tried? I’m thinking of working my way down from 1.28 (that’s where new DMA defines were introduced, which would require kernel level modification) till I get a working compile.
Either way I have to get this working, can’t afford a slot for a supported card.
My most recent attempt used the sky2.c out of 2.6.33.7 (which is v1.26). I am pretty sure the PCI defines for the 88E8057 have been around for a while in sky2.c. I looked at several kernels during my various attempts. I have a 2.6.32.24 that has v1.25, and does include the PCI defines for the 88E8057. I also looked at 2.6.31 which includes v1.23 and even that has the defines for the 88E8057. The older sky2.c in the sky2-and-skge-for-esxi4-0.02.tar.gz download I have is from 2.6.26 (but out of a debian lenny source) and it is based on 1.21
There’s a comment from forzyte higher up indicating that the 88e8057 works on ubuntu with a 2.6.32-24 kernel so that seems to imply that v1.25 should definitely work.
Okay, the highest version I am able to compile is 1.22; after that, changes in skbuff and netdevice made things too complicated for me. (How did you get past all the changes in struct net_device, pci_dma_mapping_error, vlan_gro_receive, etc?)
The compiled driver is pretty much useless but actually works for the first minute. I was able to get a DHCP lease and ping from within the console but as time went on, the server starts degenerating. Logs seemed to show that one of the PCPUs was locking up (no heartbeat for 60s) and a NMI was generated. Eventually I got a PSOD about a deadlock.
I’m looking into it, but would anyone else be interested in my work so far (it is pretty much what kernel has done so far but on the 1.22 driver)? If so I will spend some time cleaning up my current work, attach the core dumps and backtraces, and package it online.
Oh, the stack trace seems to point to:
sky2_poll
napi_poll
Now I know where to look.
(sorry about the multiple comments in a row)
I’ve tracked the lockup to the following piece of code (it’s somewhere in sky2_poll):
while ((idx = sky2_read16(hw, STAT_PUT_IDX)) != hw->st_idx) {
work_done += sky2_status_intr(hw, work_limit - work_done, idx);
if (work_done >= work_limit)
goto done;
}
It appears that eventually the while condition stays true and work_done does not increment anymore.
I don’t fully understand the driver yet (that’s about 5000 lines to digest), but I’m going to try looking into how this condition is reached.
After that, I’ll just do a simple workaround that breaks out of the loop after X loops and see what happens.
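For anyone wanting to try the same kind of workaround, here is a minimal sketch of the idea in plain C. It is a toy model and not the actual sky2 code: `toy_hw`, `toy_status_intr`, `toy_poll` and `MAX_STATUS_PASSES` are all made-up names, and the `stuck` flag simply imitates the situation where the status handler stops making progress.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cap on outer-loop passes; the "X" from the comment above. */
#define MAX_STATUS_PASSES 64

/* Toy stand-ins for the hardware put index and the driver's own index. */
struct toy_hw {
    uint16_t put_idx;   /* plays the role of the STAT_PUT_IDX register value */
    uint16_t st_idx;    /* driver's position in the status ring */
    bool     stuck;     /* when true, the handler makes no progress */
};

/* Stand-in for the status handler: returns the number of entries handled. */
static int toy_status_intr(struct toy_hw *hw, int budget, uint16_t idx)
{
    int done = 0;

    if (hw->stuck)
        return 0;
    while (hw->st_idx != idx && done < budget) {
        hw->st_idx++;
        done++;
    }
    return done;
}

/* Poll loop with a pass counter. Returns the work done, or -1 if the
 * guard tripped because the indices never caught up. */
int toy_poll(struct toy_hw *hw, int work_limit)
{
    int work_done = 0;
    int passes = 0;
    uint16_t idx;

    while ((idx = hw->put_idx) != hw->st_idx) {
        work_done += toy_status_intr(hw, work_limit - work_done, idx);
        if (work_done >= work_limit)
            break;
        if (++passes >= MAX_STATUS_PASSES)
            return -1;  /* bail out instead of spinning forever */
    }
    return work_done;
}
```

In a real driver the bail-out path would presumably also want to log something, since silently leaving the loop only hides whatever is keeping the indices apart.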
Hey thanks for looking so deeply. Much appreciated
Okay, that was disappointing…
It appears that the status ring buffer is only 512 (bytes?) long, but the code was set to 2048. (That’s my conjecture at this point; it may turn out that some extra initialisation needs to be done, or that a bug in the code is causing this.)
To get it to work, change the define for STATUS_RING_SIZE from 2048 to 512 in sky2.c.
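To illustrate how a ring size mismatch could produce exactly this kind of stall, here is a small toy simulation in plain C. It rests on two assumptions that are not confirmed anywhere in this thread: that the hardware wraps its write index at 512 entries, and that the driver refuses to process a slot the hardware has not written. All names (`toy_ring`, `hw_produce`, `drv_consume`, `toy_run`, `PASS_CAP`) are invented for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define HW_RING_ENTRIES 512    /* assumed: where the toy "hardware" wraps */
#define MAX_RING        2048   /* largest ring the toy driver may assume */
#define PASS_CAP        8      /* give up after this many no-progress passes */

struct toy_ring {
    bool     hw_owned[MAX_RING]; /* true once "hardware" has written the slot */
    uint16_t put_idx;            /* hardware write position */
    uint16_t st_idx;             /* driver read position */
};

/* Hardware writes n entries, wrapping at HW_RING_ENTRIES. */
static void hw_produce(struct toy_ring *r, int n)
{
    for (int i = 0; i < n; i++) {
        r->hw_owned[r->put_idx] = true;
        r->put_idx = (uint16_t)((r->put_idx + 1) % HW_RING_ENTRIES);
    }
}

/* Driver consumes up to put_idx, wrapping at drv_ring_size.
 * Returns true if it caught up, false if it stalled on an unwritten slot. */
bool drv_consume(struct toy_ring *r, uint16_t drv_ring_size)
{
    int idle_passes = 0;

    while (r->st_idx != r->put_idx) {
        if (!r->hw_owned[r->st_idx]) {
            /* Nothing to process here, yet the indices still differ. */
            if (++idle_passes >= PASS_CAP)
                return false;
            continue;
        }
        r->hw_owned[r->st_idx] = false;
        r->st_idx = (uint16_t)((r->st_idx + 1) % drv_ring_size);
    }
    return true;
}

/* Push `total` entries through in bursts of 100 and report whether the
 * driver kept up the whole time. */
bool toy_run(uint16_t drv_ring_size, int total)
{
    struct toy_ring r;

    memset(&r, 0, sizeof r);
    for (int sent = 0; sent < total; sent += 100) {
        hw_produce(&r, 100);
        if (!drv_consume(&r, drv_ring_size))
            return false;
    }
    return true;
}
```

In this model, `toy_run(2048, 500)` is fine because the hardware has not wrapped yet, which would fit the driver working for the first minute. Once the hardware wraps at 512 and the driver walks on to slot 512, the indices never meet again and no further work gets done. Whether the real chip behaves this way is still only a guess.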
I’ll leave the server doing a whole lot of pings and repeatedly loading the web client from my computer and pinging the server overnight, just to test the stability. No problems so far for the past 15 minutes and CPU utilisation is low (judged from the CPU temperature).
If the server isn’t on fire tomorrow when I return, I’ll remove the debug code, package up and leave a copy here. Or kernel, you can actually do it on your side as well, though I’m not sure what changing that will do to other chips (maybe you can test it out?).
The log seems to be complaining about WOL not available:
Hostd: Unable to enable WOL Operation not supported: Operation not supported
WOL capability for nic cannot be turned on.probable bug file against driver: sky2
and the module being unsigned:
Kernel module sky2 was loaded, but has no signature attached
But I can definitely say that I have the 88E8057 running in ESXi. =)
Alright, the driver never missed a tick during the last 8 hours. ping tells me that I have about an average roundtrip time of 0.5ms with a maximum of 4ms over the local network. There were no connection timeouts loading the management page as well.
I’ve rebooted and enabled both ports and they both work well with failover within 1s if I unplugged the active port.
Then here it is:
http://www.mediafire.com/file/48dd012ap4xlufy/sky2-1.22-for-esxi4-88e8057-r1-xieliwei.tar.gz
If kernel won’t mind, would you mirror a copy?
I’ll be sticking around here a while to see whether anything bad turns up with the driver. I’m kind of satisfied with what I have now (it’s just for management) and probably won’t be looking into the 512 limitation unless somebody points out what could be wrong.
And thanks kernel for most of the work, I wouldn’t have tried if you hadn’t! =)
(PS, can I place the oem file somewhere such that the ESXi installer can use the sky2 driver? I definitely have to install to harddrive because USB booting fails very regularly with the stick I’m using.)
xieliwei, that is fantastic news for people with 88E8057s. Thanks for the effort. I’ll keep a copy of your driver here as a mirror. And I’ll also put an update in the main post so that it’s easier to find.
That is odd re the STATUS_RING_SIZE. I’ve noticed looking at later revisions of the (linux kernel) driver, that subtle changes have been made to some of those other #defines in the same section of the code.
Happy to mirror your oem.tgz if that helps. Thanks again.
While the new driver for the 88e8057 does work in my laptop, it is losing 67% of the packets and has ping times of 1000 ms with an occasional 90 ms. I will try to play with it a little more.
After a restart, pinging from the unsupported console to another computer yielded 100% packet loss. Pinging loopback yielded good pings, but I then tried to ping the external address again and everything worked as expected. I am on ESXi 4.0 update 2. I have not tried 4.1 yet.
Well, after a few more minutes, pings are erratic again.
Maybe it’s something to do with the ESXi version you’re using. Are you able to update to 4.1?
yes, I will download 4.1 tonight and check it out. Thanks for the hard work.
After switching to 4.1 the sky2 driver appears to work, at least for management, on my 88e8057.
garyfloyd, does that mean that you are still experiencing problems?
I did an install of XP last night using the vmnet bridged interface to the 88e8057 and ran Windows Update overnight; it completed okay.
I wonder what has changed between 4.0 and 4.1. It sounds like it was almost working in 4.0 though. For others who wish to try the driver on 4.0, here is an oem.tgz for my debug version of the driver; just rename it to oem.tgz:
http://www.mediafire.com/file/m836gce3r3b7ygi/sky2-1.22-for-esxi4-88e8057-r1-oem-debug-xieliwei.tgz
By debug I mean lots of amateurly placed kprintf statements. The vmkernel console will be spewing out status statements for every RX frame.
I’m interested in what the statements are before and when the driver starts/is failing. If anyone wishes to try, please copy the messages.* files from /var/log out and provide me a copy.
Take note that if the driver eventually goes into an endless loop, the messages.* files will be overwritten when the total log size reaches about 7MB, so please copy the logs out before then.
Kernel’s method of obtaining the simple.map file from a running system is a good method for getting files off ESX with a non-functioning NIC.
The driver appears to be working correctly with 4.1. However, that is just based on pings and the vSphere client. I have not used that interface for actual VM traffic yet.
it works also for me. (esxi 4.1 on marvell/galileo 11ab:4380 – mod.88e8057) pc is lenovo thinkcentre 7303-wp5.
Tested iSCSI with Openfiler for a couple of hours of continuous disk access with hdspeed.exe on a VM! AWESOME!
one question:
can i do think trick on ESX 4.1 ?thx!
From Italy with love, a happy sysadmin!
Yay! =)
Is that question directed to me? I’m not sure what think trick is. Assuming you mean ‘this’ instead of ‘think’, I cannot guarantee that it would work on ESX since I don’t have an ESX license. However, since ESXi is just the core version of ESX, I don’t see why it wouldn’t work.
Just a heads up after my 3 days worth of testing; it appears that the driver can PSOD when disabling the interface from the yellow management interface. I couldn’t replicate it but it did happen twice, both involving the sky2_poll() function. My hunch is a race condition bug brought in from the original driver. It shouldn’t be a big problem though as one normally wouldn’t disable an interface.
I think ESX and ESXi are fundamentally the same thing at the basic kernel level that we’re looking at here
I don’t see it mentioned in the comments above, but VMware snuck in a sky2 into 4.1. I’ve posted some details here – http://www.vm-help.com/esx41/sky2_driver.php.
Thanks Dave. I still need to get around to looking at 4.1. So does the installer include the sky2 driver, or is it just the source code included in the oss files?
I seem to recall seeing a sky2.o module the first time I tried getting the 88e8057 working, but I was lacking sleep at that time… Somehow after modifying the simple.map file to load that module, it reported missing symbols.
I just upgraded my Dell T110 server from ESXi 4.0 to 4.1. The intention was to get a new Galileo NIC working as a secondary port, and I read that the sky2 driver was present in 4.1. This Galileo uses the Marvell 88E8075 (11ab:4370).
After performing the simple.map and pci.ids customizations, I could get the machine to recognize the card as something it could passthru, but not as a NIC for the host. Looking in the logs revealed the same undefined symbol problem with bitreverse as described above.
I used the driver from sky2-1.22-for-esxi4-88e8057-r1-xieliwei.tar.gz straight up using the same oem.tgz mechanism, and the host was able to recognize it right off. Very cool. Thanks a million!
Just tested xieliwei’s driver on a Shuttle SX58J3 with 88E8057 onboard.
management and simple VM Network seem to work smoothly, thanks a lot!!!
hi samarium
I am also using Asus P5Q-E
The below are what I have found
marvell 88e8056
pci id: 11ab:4364
marvell 88e8001
pci id: 11ab:4320
When I load sky2.o, the 88e8056 is discovered successfully.
However, when I load skge.o together with it, the marvell 88e8056 is not found but the marvell 88e8001 is loaded. I also get a screen with red text stating
0:00:00:23.538 cpu2:4808)Elf: 3043: Kernel module skge was loaded, but has no signature attached.
Were you able to load both drivers?
If I were to load just skge.o alone, I would also encounter
0:00:00:23.538 cpu2:4808)Elf: 3043: Kernel module skge was loaded, but has no signature attached.
The marvell 88e8001 will still be loaded.
It seems like I cannot get both Marvell NICs running. There is some signature error on skge.o
I have tried both versions
Fantastic! I was able to get the integrated Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller NIC (11ab:4364) working in my Shuttle SG31 (SG31G20) with the Sky2 driver posted here, and after some editing of pci.ids and simple.map.