(Note: This post was initially written when ESXi 4.0 was available. As of late 2010, ESXi 4.1 has been released, and it does actually include a sky2 driver that may or may not work with various Marvell LAN chipsets. The post is still relevant (especially the comments) if your particular Marvell chipset does not work with the sky2 driver in ESXi 4.1. Also, the post is relevant if you’re interested in porting other network drivers to ESXi.)
Well, after somehow getting my Marvell LAN card working with ESXi 3.5u4 (and u3) I thought I’d have a look at ESXi 4. Again I somehow got it to go. I’m not too sure how well it works, but it works well enough for me at home. If you can’t be bothered reading about me going on and on and on and on about how to compile it, then just scroll to the bottom of the post. The download for the source includes a precompiled module (NB: As per the post about ESXi 3.5, this is all about getting an 88E8053 chipset Marvell LAN working).
ESX/ESXi 4 is quite different from 3.5. The build chain is similar to 64 bit Redhat/Centos 5.2, so I ended up installing a x86_64 Centos 5.3 inside a vmware fusion machine to do my dev work. I just made sure I installed all the dev stuff. Then I downloaded the VMware-esx-public-source-4.0-162945.tar.gz from VMware’s open source page. It’s a much bigger file (590MB) than the file for 3.5. When you extract the file you end up with a lot of rpm files plus a vmkdrivers-gpl.tgz file. I did the following to extract it all on my test machine;
cd ~
mkdir vmware-oss
cd vmware-oss
tar xvzf ~/VMware-esx-public-source-4.0-162945.tar.gz
mkdir drivers
cd drivers
tar xvzf ../vmkdrivers-gpl.tgz
One of the rpm files included is a kernel source rpm. I’m not exactly sure how it is relevant to ESX, but I installed it anyway for reference. I found I needed the qt-devel and gtk2-devel packages first;
cd ~
cd vmware-oss
yum install qt-devel
yum install gtk2-devel
rpm -iv kernel-sourcecode-400.2.6.18-128.1.1.0.4.159770.x86_64.rpm
I’m pretty sure you don’t need the kernel source to build the drivers, but I kept it handy for reference anyway.
You can try doing a test build of the drivers now. This will build all the drivers built in to ESX/ESXi.
cd ~/vmware-oss/drivers
./build-vmkdrivers.sh
You’ll probably get a few warnings, but it should complete. If you do a find down the ‘bora’ directory you should see a bunch of .o files corresponding to the kernel modules (look under the ‘bora/build/scons/build’ directory).
OK, my approach was to look at the build-vmkdrivers.sh script and basically look at what was done to compile one network driver (I used the forcedeth driver as a reference) and just make a reduced script for my sky2 driver. As for the source to base the sky2 driver on, instead of using a driver from 2.4.37 like I did with the 3.5 version of the network driver, this time I ended up using the sky2 source from 2.6.26 (or the debian lenny incantation of it). I did originally use the sky2 driver from the kernel-sourcecode…2.6.18…rpm file, but on closer inspection of the tg3 driver that ESX uses, I noticed it actually comes from a 2.6.24.1 kernel (or thereabouts) … so I thought I may as well use a more modern reference source. I had a few minor hiccups trying to get it to compile, but in the end I just had the following shoved into the top of my sky2.c file;
/* Stuff for ESX compile */
#define upper_32_bits(n) ((u32)(((n) >> 16) >> 16))
#define csum_offset csum
#define bool int
#define PTR_ALIGN(p, a) ((typeof(p))ALIGN((unsigned long)(p), (a)))

u32 bitreverse(u32 x)
{
    x = (x >> 16) | (x << 16);
    x = (x >> 8 & 0x00ff00ff) | (x << 8 & 0xff00ff00);
    x = (x >> 4 & 0x0f0f0f0f) | (x << 4 & 0xf0f0f0f0);
    x = (x >> 2 & 0x33333333) | (x << 2 & 0xcccccccc);
    x = (x >> 1 & 0x55555555) | (x << 1 & 0xaaaaaaaa);
    return x;
}
The defines are to remedy compilation errors, and the bitreverse function is there to satisfy an undefined symbol problem. Note that the undefined symbol errors end up in /var/log/messages on your ESXi 4 box now (unlike 3.5).
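Picking the symbol names out of the log by eye gets tedious when there are several of them. Here is a minimal sketch that does it with sed, assuming the warnings look like the "Relocation of symbol <name> failed: Unresolved symbol" lines quoted in the comments on this post. The function name unresolved_symbols is just something I made up for illustration.

```shell
# Print each unresolved symbol once, reading vmkernel log text on stdin.
# Assumption: the warnings have the form
#   ... Relocation of symbol <name> failed: Unresolved symbol
unresolved_symbols() {
    sed -n 's/.*Relocation of symbol <\([^>]*\)> failed: Unresolved symbol.*/\1/p' | sort -u
}
```

You would run it as something like `unresolved_symbols < /var/log/messages`. I have not checked whether the shell on the ESXi box has sed and sort, so it may be easier to copy the log off and run this on the build machine.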
So in the end to compile my driver I did a;
./build-sky2.sh
You get a couple of warnings, but if you do a ‘find . -name sky2.o’ you should end up with two sky2.o files. There are DEBUG and DASHG variables defined at the top of the build script. If you uncomment these it’ll build a lot of debug stuff into the modules.
If you get some errors, it’s probably because some directories are missing in the build path, so make them first;
mkdir -p bora/build/scons/build/vmkdriver-sky2.o/release/vmkernel64/SUBDIRS/vmkdrivers/src26/drivers/net/sky2
mkdir -p bora/build/scons/build/vmkdriver-sky2.o/release/vmkernel64/SUBDIRS/vmkdrivers/src26/common/
Now, again I used a USB stick with ESXi 4 (build 171294 in my case). To install it, I loopback mounted the VMware ESXi 4 iso file, extracted the image.tgz file to a temp directory, bunzip2’d the big dd image file, then dd’d it to the whole USB stick (NB: There are some details about how to do this in linux on my other post re ESXi 3.5. See link at the top of the post).
A difference this time is that I didn’t have a simple.map file all pre-prepared to go into the oem.tgz file. I thought the easiest way to get it would be to just boot ESXi and let it fail when it tries to configure a network device, then somehow copy the simple.map file off. So I did this. ESXi 4 merrily boots and eventually you see the dreaded ‘lvmdriver failed’ message. It looks like ESXi is broken at that point, but just type the word ‘unsupported’ and you get a password prompt, and just hit ENTER to get a prompt (You might need to hit alt-f1 first before typing ‘unsupported’)
Because networking is not working, we’ll just copy the simple.map to the Hypervisor1 partition;
cp /etc/vmware/simple.map /vmfs/volumes/Hypervisor1
I just did a ‘sync’ and held in the power switch (perhaps type ‘reboot’ if you feel like being more careful). Now get the USB stick to appear as a USB device in your development VM (your CentOS 5.x environment), and mount partition 5 (or the Hypervisor1 partition) off the USB drive, and you should see an oem.tgz file as well as the simple.map file. You need to make up a directory structure for the new oem.tgz file we’ll be creating;
cd ~
mkdir vmtest
cd vmtest
mkdir -p etc/vmware
mkdir -p usr/lib/vmware/vmkmod
Copy the simple.map off the USB drive into etc/vmware directory in our tree structure. eg.
cp /mnt/simple.map ~/vmtest/etc/vmware
And edit the simple.map so that it includes the PCI ids for your Marvell card. Mine is 11ab:4362 so I added in the 11ab:4362 line below, but yours could well be different. If you’re not sure, you could boot ESXi off the USB stick again, do the ‘unsupported’ thing to get a prompt and type lspci -v
1166:0410 0000:0000 storage sata_svw.o
1166:0411 0000:0000 storage sata_svw.o
11ab:4362 0000:0000 network sky2.o
14e4:1600 0000:0000 network tg3.o
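If you would rather script the edit than do it by hand, here is a rough sketch. It assumes the simple.map contains only entry lines in the four-column form shown above, and it re-sorts the file so the new line lands in id order like in my listing. I don’t actually know whether ESXi cares about the order. The function name add_map_entry is hypothetical.

```shell
# add_map_entry MAPFILE VENDOR:DEVICE CLASS MODULE
# Appends "VENDOR:DEVICE 0000:0000 CLASS MODULE" unless that PCI id is
# already listed, then sorts the file so entries stay in id order.
# Assumption: MAPFILE holds nothing but entry lines (no headers/comments).
add_map_entry() {
    ame_map=$1
    ame_id=$2
    ame_class=$3
    ame_module=$4
    if grep -q "^$ame_id " "$ame_map"; then
        echo "$ame_id is already in $ame_map" >&2
        return 0
    fi
    printf '%s 0000:0000 %s %s\n' "$ame_id" "$ame_class" "$ame_module" >> "$ame_map"
    # sort reads all of its input before writing, so sorting in place is safe
    LC_ALL=C sort -o "$ame_map" "$ame_map"
}
```

For my card that would be `add_map_entry ~/vmtest/etc/vmware/simple.map 11ab:4362 network sky2.o`. Running it a second time with the same id leaves the file alone.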
Now copy in the sky2.o file that we compiled earlier. The modules are in a different directory compared to 3.5 (NB: the compilation process produces two sky2.o files, so make sure you grab the one shown below)
cd ~/vmware-oss/drivers
cp ./bora/build/scons/build/vmkdriver-sky2.o/release/vmkernel64/sky2.o ~/vmtest/usr/lib/vmware/vmkmod
Now tar it up, and copy it to the USB stick that should be still mounted;
cd ~/vmtest
tar cvzf ~/oem.tgz *
cp ../oem.tgz /mnt
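A wrong id or a misplaced file only shows up after a reboot, so it can be worth listing the archive before you go any further. This is a small sketch, assuming the tarball was made from inside the vmtest directory as above so the paths have no leading ‘./’. The function name check_oem_tgz is made up.

```shell
# check_oem_tgz ARCHIVE MODULE
# Succeeds only if ARCHIVE contains etc/vmware/simple.map and
# usr/lib/vmware/vmkmod/MODULE, with paths relative to the tree root.
check_oem_tgz() {
    cot_listing=$(tar tzf "$1") || return 1
    for cot_want in etc/vmware/simple.map "usr/lib/vmware/vmkmod/$2"; do
        if ! printf '%s\n' "$cot_listing" | grep -qxF "$cot_want"; then
            echo "missing from $1: $cot_want" >&2
            return 1
        fi
    done
}
```

For example `check_oem_tgz ~/oem.tgz sky2.o && echo looks ok`. It only checks that the files are present, not that the simple.map inside has the right PCI id.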
Unmount the USB stick
umount /mnt
Now try booting again. Hopefully you should see a ‘loading sky2’ flash up early in the boot … and it should eventually get to the usual ESX status screen showing the current mgmt IP address. Basically if you don’t see the ‘lvmdriver load failed’ then there’s a good chance it’s working.
And yes here is the sky2-for-esxi4-0.01.tar.gz download. It includes the build script, the modified source, plus directory tree for creating the oem.tgz file including a pre-compiled copy of the module. If you can’t be bothered compiling, you can just extract this file, cd to the vmtest directory and create the oem.tgz file as per the earlier notes.
UPDATE: (2010/02/08) There is also now a driver for the Marvell 88E8001 LAN chipset (see the comments discussion below). This uses the skge driver, not the sky2 driver mentioned above. I don’t own a 88E8001, so thank you to samarium for helping out re the unresolved symbols. Please comment below if you’ve tried the skge driver and it works. I have a new tarball sky2-and-skge-for-esxi4-0.02.tar.gz containing both the sky2 and skge driver
UPDATE: (2010/12/24). xieliwei has a driver that should support the 88E8057 on ESXi 4.1 (as per the latest comments, it seems a bit iffy on 4.0). Anyone else who has this chipset, if you could try the driver and post more information that would be great. The 88E8057 code is here as a local copy ; sky2-1.22-for-esxi4-88e8057-r1-xieliwei.tar.gz or on mediafire; sky2-1.22-for-esxi4-88e8057-r1-xieliwei.tar.gz . Also there’s another version of the driver that will produce copious debug information (only of use if you’re having problems); http://www.mediafire.com/file/m836gce3r3b7ygi/sky2-1.22-for-esxi4-88e8057-r1-oem-debug-xieliwei.tgz
Hi there!
Firstly, a very well written blog entry.
I too have a Marvell Yukon and wanted to evaluate ESX/(i)4.
The installation isn’t proceeding beyond the unrelated complaint about not being able to load the lvmdriver.
I’m not installing from CD/DVD or USB but via PXE (obviously replacing the root oem.tgz file).
My particular chipset is a 88E8001, so slightly different to yours.
I’d hoped merely changing the ‘simple.map’ file to reflect my vendor/product ID would work (I changed mine from your 11ab:4362 to my 11ab:4320).
Upon reboot I noticed the Sky2.o module being loaded (it flashed by) but then presents me with the same lvmdriver loading error message (I have a 64GB SATA SSD and a 500GB SATA 2.5″ HDs present in a Shuttle SN95G5 (nForce3 chipset)).
Is it possible to use a precompiled Marvell .ko file from somewhere similar to the kernel 2.6 one here; http://www.marvell.com/drivers/driverDisplay.do?driverId=153?
If so; does this need to be compiled against a working install with all of the compile flags as per your source/build script?
Any ideas, as I’m loath to fill the only free PCI slot with an Intel e1000 (as I want to use it for a nice new TV-Tuner for a HTPC)?!
Looking forward eagerly in anticipation of a response.
Cheers,
John
PS: Don’t suppose you know if it’s also possible to dual boot ESX4/(i) and XenServer5.5 (and if so; know a good HOWTO)?
Wow, someone actually reads this! 😉
Get into ‘unsupported’ mode (ie. alt-f1, type ‘unsupported’ and maybe a password) and cd /var/log and do a ;
less sysboot.log
and scroll until you find the messages about loading sky2. Mine just says ;
Module sky2 loaded successfully.
When trying to get the driver to work I tended to see a lot of ‘unresolved symbol’ warnings in the sysboot.log when it attempted to load sky2. Do you get any of these? A possibility with your onboard Marvell is that its referencing different parts of the driver to mine … hence the possibility of unresolved symbols.
If you run a regular linux distro on this host, do you know which network module gets loaded? Is it definitely sky2? or maybe the skge one? (maybe boot a linux ISO and get to a prompt and try lsmod | grep sk to see which one is loaded). I just had a go at getting skge to compile (using the driver from 2.6.26), and it seemed to compile ok. I have no idea if it will work, but you can grab it from here. Of course you’ll need to shove it in your oem.tgz and change the simple.map so that it references skge.o, not sky2.o. Tell me how you get on. (UPDATE: 2010/1/25. Looks like the skge driver obviously does not work. It was a ‘best guess’ compile. But it’s very difficult to diagnose what the problem is without a machine with an skge card in it. As per the comments below, if someone can post what the actual unresolved symbols are, that will be a ‘start’ in working out how to resolve it).
Hi,
I have a Marvell 88E8001 card (11ab:4320) and can’t get it to work with ESXi 4.
Tried sky2.o driver… Driver loaded successfully but card not recognized by ESXi.
Tried skge.o driver… and log says: “vmkload_mod: Can not load module skge: Unresolved symbol”
CentOS 5.4 Live CD loads skge module and card works fine.
There are pretty new drivers on the Marvell site but I don’t know how to compile them to work with ESXi 4.
Regards
Not too sure. Is there anything else listed in the sysboot.log? Like if you set it up to use just the skge.o driver, is there anything else in the sysboot.log? Often it lists exactly what the unresolved symbols are. Alternatively, if you configure it to boot using just the sky2.o driver instead, what does ‘esxcfg-vmknic -l’ return (in ‘unsupported’ mode)?
Hi Kernel,
I’ve got 3 Marvell 88E8001 (11ab:4320) cards at home and would like to use them with ESXi 3.5 U5. I tried the SKGE.O driver you provided, but I can’t get it to work.
It did try to load SKGE.O, but there are a lot of unresolved symbols in /var/log/config.log
Did you compile SKGE.O for ESXi 4.0 only? If yes, can you make a version for ESXi 3.5 (U5)? I saw another topic from you about the Marvell Yukon NIC on ESXi 3.5; it seems to be the same as what someone posted on vm-help.com. I did try SKY2.O with no luck; as they mentioned, the source code of the SKY2.O driver has no 11ab:4320.
Let me know if you need any more info from me.
Thanks in advance.
Kernel,
I’ve got more info. Based on the lspci -vvv output, the NIC was driven by SKGE under the SLAX linux live CD, so I believe SKGE.O will be the better bet. I found Marvell’s latest driver for the 88E8001 here:
http://extranet.marvell.com/drivers/files/Linux_10.81.6.3.zip
As I’ve got 2 old PCs without 64-bit & VT-x, I can only install ESXi 3.5 on them. One of them even has an onboard 3C940 NIC, which doesn’t seem to be supported either.
Looking forward to hearing from you, thanks
i’ve been trying to get the latter marvell network controller working with ESXi4 also, and had same conclusions as the previous poster.
according to this debian mailing list, the driver for the 11ab:4320 would be the sk98lin.
http://lists.debian.org/debian-boot/2004/02/msg00230.html
do you know if that driver is available to compile for esxi 4?
Thanks,
~tim
As per an earlier reply, for those with an 88E8001 based card, I need more information about what is happening when esxi is booting. Like I said; Is there anything else listed in the sysboot.log? Like if you set it up to use just the skge.o driver, is there anything else in the sysboot.log? Often it lists exactly what the unresolved symbols are (basically if someone can tell me what the actual unresolved symbols are I can have another go at compiling it with these missing symbols compiled in). Without an actual 88E8001 card I can only really guess.
ESX4i U1 (build 208167) on an ASUS P5Q-e.
$ lspci | awk '/AHCI|Gigabit/'
00:1f.2 SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller
02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller (rev 12)
07:02.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001 Gigabit Ethernet Controller (rev 14)
$ lspci -n|egrep `lspci | awk '/AHCI|Gigabit/{l=l s $1; s="|"}END{print l}'`
00:1f.2 0106: 8086:3a22
02:00.0 0200: 11ab:4364 (rev 12)
07:02.0 0200: 11ab:4320 (rev 14)
AHCI driver for ICH10R supported out of the box.
sky2 driver provided by KernelCrash works in oem.tgz but with 11ab:4364 id, thanks.
skge driver failed to load. skge is what normally runs on this box under ubuntu, so not sure why the previous commenter was talking about sk98lin? Maybe a previous driver for this card.
sysboot.log output from system trying to load skge:
vmkload_mod: Can not load module skge: Unresolved symbol
[2010-02-06 08:55:29 ‘VmkCtl’ warning] Loading module skge.o failed. Exec of command ‘/sbin/vmkload_mod skge ‘ succeeded, but returned with non-zero status: 1
messages output from system trying to load skge:
Feb 6 08:55:29 vmkernel: 0:00:00:16.456 cpu2:4717)Loading module skge …
Feb 6 08:55:29 vmkernel: 0:00:00:16.456 cpu2:4717)Elf: 2320: symbols tagged as
Feb 6 08:55:29 vmkernel: 0:00:00:16.464 cpu2:4717)WARNING: Elf: 1570: Relocation of symbol failed: Unresolved symbol
Feb 6 08:55:29 vmkernel: 0:00:00:16.471 cpu2:4717)ALERT: Elf: 2518: Kernel module skge was loaded, but has no signature attached
Feb 6 08:55:29 vmkernel: 0:00:00:16.471 cpu2:4717)WARNING: Elf: 2542: Kernel based module load of skge failed: Unresolved symbol
so it looks like skb_pad is the culprit.
Hey thanks for that. I guess the skb_pad bit is off the end of the lines you pasted in. I’ve found the skb_pad routine out of the same kernel I used for the skge driver (2.6.26) and shoved it into the driver source code, and recompiled. I’ve uploaded this new test version of the skge driver. Can you tell me if it works .. or perhaps gets other symbol errors?
I’ll have a look later today.
Looking at the comment I left compared to what I pasted, I think the line got chopped by wordpress, because what it said on the Elf: 1570 line between symbol and failed was
lessthan skb_pad greaterthan
so I guess wordpress is interpreting it as an unknown html token.
Now skge loads, and works in so far as I can ping test successfully. Thanks.
Not sure if it will be useful to me, but since you went to the trouble of building it, I thought it would be nice to get it tested.
If I get keen, I’ll build up my own build environment, and maybe take a crack at getting the dual port PCI-E Silicon Image 3132 adapters I have working under ESXi, but too many other projects on the go for it to happen soon.
Thanks again for the NIC drivers.
Hey, thats great news. It’d be good to see whether it still works OK under some load. I might update the main post re the skge driver working now and encourage people to test it more thoroughly and comment. Thanks again.
kernel,
I’m trying to create and add drivers to a ESXi 4 install. Have any tips: http://www.vm-help.com/forum/viewtopic.php?f=12&t=2002
Thanks!
I don’t know much about that Broadcom card. If I was trying to get it to go, I’d work out whether there is an existing linux kernel driver for that Broadcom chipset, and use that rather than a linux driver off the broadcom site. The rationale is that a built-in linux kernel network driver is probably going to be more like the sky2 or skge drivers that I’ve modified OR some of the drivers that vmware themselves supply the source for (such as the forcedeth one). You can easily see the changes I made to say the sky2 linux driver by just downloading the linux kernel source for 2.6.26, pull out the sky2.c file and diff it against the one that I’ve included as a download here (or the skge.c one). I didn’t end up making many changes. Just have a go at trying to get your driver to compile. Once it compiles OK, then you might find some unresolved symbols popping up in the esxi boot logs, fix all those and see if it works.
Hi
Great site. I have tested on my whitebox ESXi 4 U1 with a Dlink DGE-530T (Marvell Yukon 88E8001). Anyway, this particular NIC uses PCI id 1186:4b01; I just added that into the simple.map and rebooted. ESXi detected it successfully. I will test it further to see if I can do NIC teaming since I have two of them.
Thanks for the driver.
Hi again,
I just tested Lagg on my Dell Gigabit Switch with Dlink DGE-530T revb1 (Rev11) so far everything is running perfectly fine.
Thanks for testing the skge driver out. Greatly appreciated.
Tested the driver on my P5Q Deluxe motherboard, driver is working great on only one of the network adapter, need to make sure that the other network adapter has not been disabled in the bios. Will update later on. Thanks for a great effort.
all is working well, as i had the wrong id in the simple.map file. ESXi 4 recognises all the network adapters now. Thanks for the wonderful effort you put into these guys.
First, thank you for the wonderful insight. This has been helpful in trying to compile a driver for the tulip.o based network cards. I think I have managed to resolve all warnings and compilation errors and can successfully generate a tulip.o module, but… I get a similar error to ‘samarium’ and was wondering if you have any insight.
Questions:
Does anyone have any idea how to compile a module to include signatures or symbols?
Does anyone know what a ‘ElfRelocateFile global’ is?
Darn… so close…
ASUS M2A/VM Motherboard
ahci.o for ATi RAID
r8169.o for RealTek 8168 NIC (Nov 3 2009)
aic7xxx.o for Adaptec SCSI (Nov 8 2009)
Installed Adaptec Quartet 4 port PCI NIC that uses the DEC 21142/43 chipset, which requires tulip.o module. Source for tulip.o retrieved from latest Linux kernel source linux-2.6.33.tar.
All compilation was done with CentOS 5.4 x64 installed on external USB drive for easy removal. The hope is to use the system as a ESXi Whitebox PC.
I get the following errors when trying to load tulip.o that I compiled.
— /var/log/sysboot.log —
vmkload_mod: Can not load module tulip: Unresolved symbol
‘VmkCtl’ warning Loading module tulip.o failed. Exec of command ‘/sbin/vmkload_mod tulip ‘ succeeded, but returned with non-zero status: 1
— /var/log/messages —
cpu1:13645 Elf: 2320: symbols tagged as
cpu1:13645 ALERT: Elf: 2518: Kernel module tulip was loaded, but has no signature attached
cpu1:13645 WARNING: Elf: 2542: Kernel based module load of tulip failed: Unresolved symbol ‘ElfRelocateFile global failed’
2F049B90 info ‘ha-eventmgr’ Event 9 : Issue detected on localhost.domain.com in ha-datacenter: Elf: 2518: Kernel module tulip was loaded, but has no signature attached
Any help or insight is more than appreciated. It would be nice to be able to compile a stable tulip.o driver since it is not yet available.
Hi grittyKitty. No I don’t know what ElfRelocateFile is in relation to your problem. Usually when I get those unresolved symbols, I search through a real linux kernel tree to see where the function is defined. ie. something like;
cd /usr/src/linux
grep -r skb_pad *
And then wade through the output trying to find the function definition. I had a look through the 2.6.26 kernel I often use as a base, and I couldn’t find ElfRelocateFile. It looks more like some kind of linker type thing, rather than a kernel function (but I could be way off).
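A plain grep -r returns every caller as well as the definition. One way to cut the output down, as a rough sketch: functions that the kernel exports to modules carry an EXPORT_SYMBOL or EXPORT_SYMBOL_GPL line in the .c file that defines them, so searching for that line usually points at the right file. The function name find_export is hypothetical, and --include needs GNU grep.

```shell
# find_export TREE SYMBOL
# Lists the .c files under TREE that export SYMBOL to modules via
# EXPORT_SYMBOL(SYMBOL) or EXPORT_SYMBOL_GPL(SYMBOL).
find_export() {
    grep -rlE --include='*.c' "EXPORT_SYMBOL(_GPL)?\($2\)" "$1"
}
```

For example `find_export /usr/src/linux skb_pad`. It will not find static inline helpers that live in header files, and it prints nothing for a name like ElfRelocateFile that is not in the tree at all.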
But, anyway I had a go myself at trying to compile your driver (using the tulip stuff out of a 2.6.26 kernel). Firstly, I’m not 100% sure what module or modules get loaded by a regular linux kernel for your card (is it just de4x5, or de4x5 and tulip or something else entirely?). If you do know, please tell me. I took a guess and thought it just loads the de4x5 module for your card (which is part of the tulip stuff). There’s a lot more to this tulip driver stuff than some of the other types of network cards, so my effort is very much a ‘1st attempt’. It compiles. I have a module. If you can try and see if it gets any further than yours great, or have a look at the source mods I made to see if there’s anything there that’s helpful to you and see how you go. So grab my de4x5-0.01.tar.gz and see if it helps. You’ll still need to put an entry in your simple.map. I am thinking that the simple.map line will be ‘blah blah network de4x5.o’ or similar.
One other thing. Is your card a 100Mbps card?
Wow. Impressive response. Thanx for the enthusiasm and extended effort. Seriously!
The Adaptec 4-port card is based on the DECchip 21142/43 chipset. It is PCI-based 32bit and, yes, it is 100Mbps. Slightly older card, but still quite useful in Win2K3 systems. Was hoping to utilize the card instead of purchasing an expensive 4-port PCI-E gigabit card. Motherboard is an ASUS M2A-VM motherboard that utilizes an ATi SB600 RAID controller, AMD Athlon X2 64 6000+ with 8GB OCZ SLI DDR2-800.
This is the original information I grabbed when initially booting with Ubuntu to help identify the card, plus a couple notes I made to get me started.
04:04.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
Class 0200: 1011:0019 (rev41)
Chip Number: DC21142/3
Chip Description: PCI/CardBus 10/100 Mbit Ethernet Ctlr
Notes: SALVADOR
Module: tulip (http://hardware4linux.info/component/23459/)
tlp — DECchip 21x4x and clone Ethernet interfaces device driver
This is what CentOS reports…
04:04.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
Subsystem: D-Link System Inc Unknown device 1110
Flags: bus master, medium devsel, latency 64, IRQ 233
I/O ports at ac00 [size=128]
Memory at fddff000 (32-bit, non-prefetchable) [size=1K]
Expansion ROM at fdd80000 [disabled] [size=256K]
04:05.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
Subsystem: D-Link System Inc Unknown device 1110
Flags: bus master, medium devsel, latency 64, IRQ 225
I/O ports at a800 [size=128]
Memory at fddfe000 (32-bit, non-prefetchable) [size=1K]
Expansion ROM at fdc00000 [disabled] [size=256K]
04:06.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
Subsystem: D-Link System Inc Unknown device 1110
Flags: bus master, medium devsel, latency 64, IRQ 217
I/O ports at a400 [size=128]
Memory at fddfd000 (32-bit, non-prefetchable) [size=1K]
Expansion ROM at fdc40000 [disabled] [size=256K]
04:07.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
Subsystem: D-Link System Inc Unknown device 1110
Flags: bus master, medium devsel, latency 64, IRQ 58
I/O ports at a000 [size=128]
Memory at fddfc000 (32-bit, non-prefetchable) [size=1K]
Expansion ROM at fdc80000 [disabled] [size=256K]
To remove my initial compile errors and warnings I had to modify the netdevice.h file to include net_device_ops structure and a couple other changes not included with ESXi’s source, related to bit padding (noticed that you used a common.h file, I used le_byteshift.h stolen from linux-2.6.33). I have included my changes that I used to compile my tulip, http://www.asynccomputing.com/files/tulip-attempt-01.tgz.
I used linux-2.6.33.tar.bz2, and VMware-esx-public-source-4.0-208249.tar.gz for my source and followed your instructions above for setting up a CentOS 5.4 x64 system. P.S. I did have to RPM the ESXi kernel source to get anything to compile properly as you mentioned in your notes at the beginning of the thread.
Interesting. I’m running ESXi 4.0 Build 208157, yet the source is marked as Build 208249?!? Curious how your de4x5.o is 139kb and my tulip.o is 5609kb. Makes me wonder if I’m actually compiling this correctly.
I compared our tulip files and they are slightly different (minus the code adjustments we made). I tried de4x5.o and got a kernel crash, or at least I think it’s a kernel crash, i.e., big nasty purple screen with text. Let me know if you want me to type out the error information from the screen. There was no room on the disk to grab a data dump since I’m using ESXi on a USB for testing purposes until I get it right — then I’ll burn an install CD when testing is complete.
P.S. ESX Server 2.0.2 appears to work fine under CentOS 5.4 x64. Alternative solution =) … and some users claim that it is better on resources, but slightly slower. The debate is open on this topic. Another time.
I’ll check out what you mentioned about the ‘ElfRelocateFile’ and see what else I can find. Google prefers searching for ‘Elf Relocate File’, but doesn’t yield anything useful.
Thank you!
What fun!
Hi,
I’m having the same problem. And I have the same motherboard. And I looked it up, my PCI ids are the same, too.
But I only have a Windows PC and I would like to avoid putting too much stuff like VMs on this PC.
I couldn’t get my USB stick working:
I tried to extract the contents of the .iso image for my esxi on the stick and made it bootable with syslinux. But it didn’t work.
Then I tried to put the created oem.tgz on the CD where I burned the .iso image earlier. But There was no change. It still said, that it does not have any network driver.
But perhaps I mixed something up, and you did the oem.tgz file onto an other partition of your usb device where you extracted the dd file to?
Because I didn’t understand the part with the oem.tgz at all 🙁
What’s the easiest way to get it working? Is there a possibility to put it into the .iso image or do I have to take a different one?
Good evening,
I compiled your version of the tulip source, but opted to try and create a tulip.o module instead of the de4x5.o. There were a couple warnings, but the module was created. The tulip.o file was, again, around 5MB in size.
I rebooted with the new tulip.o module and again, received the nasty PSOD (purple screen of death). Lots of memory addresses and registers from the looks of it.
I then restored my tulip source, recompiled and rebooted and was able to return back to the same spot I originally mentioned, i.e., Unresolved symbol, module tulip.o failed, /sbin/vmkload_mod tulip succeeded, but returned with non-zero status.
I wanted to make sure I could recompile the tulip source I was working on and had not inadvertently buggered something that was causing the PSOD.
I’ll review further on the weekend when I have more time. Thanx for listening.
darkdragon, I’ve never tried to recreate the iso. It always seemed easier to play around with a bootable USB stick since you just need to dd the appropriate file onto the usb stick (and you don’t need to do anything with syslinux). I guess I’ve done everything using a linux host since that seemed easier from my perspective. One thing about the file you dd onto the usb stick is that it includes partition mappings etc … which means you end up with that partition 5 that you can just mount and update the oem.tgz in. My guess is that there is a rather different technique if you want to recreate the iso, since iso’s don’t generally have partition mappings. I guess this isn’t much help, but I’d probably recommend getting some kind of small VM running on your windows box, that is running some linux, that you could pass through the USB stick to, dd stuff to, mount etc.
grittyKitty, I think I was wrong about the de4x5 thing, and your trying to get it running as tulip is probably the right way to go. However I’m not sure I understand the interaction between the two. If you do an lsmod on your ubuntu running on the box with the quad card, do you see both the tulip module and the de4x5 module loaded? or just the tulip module? I notice in your source that you don’t compile de4x5.c. Actually I’m a bit confused about de4x5.c. Is it some leftover from an earlier version of the driver and is not used anymore in linux? There’s a whole heap of text at the start of de4x5.c sort of implying it could be used independently and that it is aimed at the 21142/43 chipset. It makes me wonder whether if you just created a build script for de4x5 and vmklinux_module bit and linked them whether it would work. I just did. Interestingly that does compile with no errors. I’ll have more of a look over the weekend
Thank you for your reply. Then I’ll try it with the USB-stick, too. But I didn’t get it working…
What do I have to do exactly, when I don’t do this compiling stuff?
I downloaded the source code, but I think I don’t need this if I would compile it myself.
I tried to dd the dd-file from the iso-image (image.tgz | I took it from the “normal” downloaded esxi – not source) to my usb-stick. my ubuntu did it without errors. But when I tried to access the drive, it gave me strange file-names. When I tried to boot from the stick, it didn’t work.
What am I doing wrong?
btw: Is it possible to get E-Mail notifications when somebody else posts a reply?
Yeah, not too sure what’s going wrong for you. There are plenty of examples on the net of how to get esxi onto a usb stick. I just googled for ‘esxi usb stick dd’ and there are plenty of useful results.
darkdragon, I just pulled the dd file from the iso and then used WinImage on a WinXP system to write it to a blank 4GB USB stick. Worked like a charm. Used this link… http://www.techhead.co.uk/how-to-create-a-bootable-vmware-esxi-usb-pen-drive.
p.s. WinImage did not work on Windows 7, so I had to run it on a WinXP VM
kernel, I have to check, but I’m pretty sure the tulip module was loaded when using ubuntu. I need to verify before committing, though… I was also wondering about compiling the de4x5 code using the tulip files I extracted from the linux-2-6-33 files.
I’ll play more with this on Sunday, when I have more time… thank you for the extra pair of eyes
Hello, I’m trying to compile sunhme module for my quad ethernet 10/100 card.
Unfortunately, using your solution, when I try to load module on ESX, it says :
May 1 10:35:19 node02 vmkernel: 0:00:47:44.486 cpu2:4104)Loading module sunhme …
May 1 10:35:19 node02 vmkernel: 0:00:47:44.487 cpu2:4104)WARNING: Elf: 1062: found undefined __this_module symbol
May 1 10:35:19 node02 vmkernel: 0:00:47:44.487 cpu2:4104)Elf: 2320: <sunhme> symbols tagged as <GPL>
May 1 10:35:19 node02 vmkernel: 0:00:47:44.493 cpu1:4104)WARNING: Elf: 1570: Relocation of symbol <sunhme_HeapID> failed: Unresolved symbol
May 1 10:35:19 node02 vmkernel: 0:00:47:44.493 cpu1:4104)WARNING: Elf: 1570: Relocation of symbol <sunhme_SkbHeapID> failed: Unresolved symbol
May 1 10:35:19 node02 vmkernel: 0:00:47:44.493 cpu1:4104)WARNING: Elf: 1570: Relocation of symbol <pci_map_rom> failed: Unresolved symbol
May 1 10:35:19 node02 vmkernel: 0:00:47:44.493 cpu1:4104)WARNING: Elf: 1570: Relocation of symbol <pci_unmap_rom> failed: Unresolved symbol
May 1 10:35:19 node02 vmkernel: 0:00:47:44.493 cpu1:4104)WARNING: Elf: 1570: Relocation of symbol <pci_unmap_rom> failed: Unresolved symbol
May 1 10:35:19 node02 vmkernel: 0:00:47:44.493 cpu1:4104)WARNING: Elf: 1570: Relocation of symbol <sunhme_HeapID> failed: Unresolved symbol
May 1 10:35:19 node02 vmkernel: 0:00:47:44.498 cpu1:4104)ALERT: Elf: 2518: Kernel module sunhme was loaded, but has no signature attached
May 1 10:35:19 node02 vmkernel: 0:00:47:44.498 cpu1:4104)WARNING: Elf: 2542: Kernel based module load of sunhme failed: Unresolved symbol <ElfRelocateFile global failed>
Do you have any idea of the problem ?
Best regards.
I read through your previous blog-post with the 3.5 version. Perhaps you should mention in this post, that you should read the first one, too.
I finally got the dd onto my USB stick. I didn’t set the parameter bs=32k last time. How was I supposed to know about it?
But then it didn’t boot from it… But there was no error, too. How long did it take for you?
But it worked using plpbt for my vmware testing machine. Perhaps my computer has any problems booting from a USB…
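For anyone else who trips over the block size, here is a minimal sketch of the dd step with bs=32k set explicitly. The helper name write_image and the file and device names in the usage comment are placeholders rather than anything from the post, and the target device must be double-checked before running it for real, since dd will overwrite whatever it is pointed at.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): write a raw dd image
# to a target using a 32k block size, then flush buffers.
# Usage: write_image IMAGE TARGET
write_image() {
    image=$1
    target=$2
    # Refuse to run if the image file does not exist.
    [ -f "$image" ] || { echo "no such image: $image" >&2; return 1; }
    # bs=32k is the block size mentioned in the comment above.
    dd if="$image" of="$target" bs=32k && sync
}

# Example (both names are placeholders; check the device name first):
# write_image esxi-image.dd /dev/sdX
```

Wrapping the command in a function only adds the existence check on the image; the dd invocation itself is the part that matters.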
I used dd for Windows. I went through many articles, but it either gave me an error, or it just stopped after a while (the progress number didn’t go up. I waited over half an hour). I formatted the stick with diskpart, too, even though I don’t think that was useful. I started dd as admin and the command prompt as admin, too. I’m trying WinImage later (Thank you for the tip!). Perhaps it works on Win7 in compatibility mode…
I was trying some other tools, too. Perhaps the problem is with the server, because booting directly from the stick didn’t work (see above). Perhaps I’ll try this with the plpbt trick later on, too.
I’m now trying to create an iso-image.
For this, I first took a dd-image from my USB stick again. Perhaps it’s better to use a virtual hard disk with exactly 900MB, as then the resulting file is smaller. Then I built the image.tgz (with the same directory tree and so on). Then I used ISO Master to replace the image.tgz file. Now I’m burning it on a CD. I’ll tell you later if it worked or not…
WinImage does the same as dd: it just stops at 70% and doesn’t get any further (I waited up to 15 min). And it’s not freeware.
Well I tried out the iso thing, and it didn’t work. (No network adapter found). Reason: see next step!
I tried booting from USB with the help of plpbt. It loaded, but it still tells me: “No compatible network adapter found.”
What am I doing wrong? I dd’d it to the USB-stick, and then replaced the oem.tgz file with the network-drivers (sky2.o).
perhaps my oem.tgz file is wrong…
I built it the following way: I put all contents of vmtest (etc and usr) and created a tar archive. I gzip’d it and then renamed it (using 7-zip in windows). So when I’m opening that file, there are the two directories (etc and usr) in there. Is this right?
Or do I have to put the sky2.o directly in that archive?
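If the intended layout is the one described here (etc and usr as the two top-level directories inside the archive), tar can build and gzip it in one step, without a separate rename. This is only a sketch under that assumption: the helper name build_oem_tgz is made up, and vmtest stands for the staging directory mentioned above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): pack a staging
# directory into a gzip'd tar with etc/ and usr/ at the top level.
# Usage: build_oem_tgz STAGING_DIR OUTPUT_FILE
build_oem_tgz() {
    staging=$1
    output=$2
    # Both directories are assumed to exist in the staging area.
    if [ ! -d "$staging/etc" ] || [ ! -d "$staging/usr" ]; then
        echo "expected etc/ and usr/ under $staging" >&2
        return 1
    fi
    # -C switches into the staging directory first, so the archive
    # members start with etc/ and usr/ rather than the staging path.
    tar -czf "$output" -C "$staging" etc usr
}

# Example:
# build_oem_tgz vmtest oem.tgz
# tar -tzf oem.tgz    # list the contents to confirm the layout
```

Listing the archive with tar -tzf afterwards is a quick way to see whether the driver file ended up under the path you expect.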
Now everything worked. I could manage it from the vSphere client.
But when I restarted it, there was nothing on the HDD.
So do you need to install it with the iso-image?
Hey that sounds like you have it working if you can talk to it using the vSphere client. The thing with the USB stick method is that you always boot off the USB stick and just use your real hard drives as storage. Having ESXi always on the USB stick is not such a bad thing as it’s relatively small and loads mostly into memory anyway (ie. there’s generally not much write activity back to the USB stick which would be relatively slow)
torrmkr, I had a quick go at a ‘first attempt’ compile of a sunhme out of a 2.6.26 kernel. I must admit I’m not too sure about some of the unresolved symbols you’re getting, namely the sunhme_HeapID, sunhme_SkbHeapID and the ElfRelocateFile (grittyKitty has highlighted these ElfRelocateFile issues as well). But the pci_map_rom and pci_unmap_rom ones are reasonably explainable. These routines are normally in pci.c in a linux kernel, but ESX doesn’t have them. We can either copy the ones from the pci.c in a real linux kernel, or work out some other option. For now (since this is a first attempt), I’ve effectively commented the calls to pci_map_rom and pci_unmap_rom out. They’re only used to work out the MAC addresses of the interfaces. There’s an alternate piece of code in sunhme.c which simply uses a random number for the MAC address if it has a problem reading the ethernet card’s rom. That’s ‘good enough’ until we actually get the driver working. Anyway, I’ve uploaded sunhme-0.01.tar.gz . Like I said it’s a ‘first attempt’ and I suspect it looks very similar to your code changes. Tell me if it works any better or worse.
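When a module load fails like this, it can help to pull just the distinct unresolved symbol names out of the vmkernel log, so you can see at a glance which routines are missing. A minimal sketch, assuming the “Relocation of symbol <name> failed: Unresolved symbol” line format shown in the sunhme log above; the function name unresolved_symbols is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of ESX): read vmkernel log text on stdin
# and print each unresolved symbol name once, sorted.
unresolved_symbols() {
    # Keep only the relocation warnings and extract the name between < >.
    sed -n 's/.*Relocation of symbol <\([^>]*\)> failed: Unresolved symbol.*/\1/p' |
        LC_ALL=C sort -u
}

# Example, using the log file mentioned in this thread:
# unresolved_symbols < /var/log/vmkernel
```

The final “Kernel based module load … failed” summary line is deliberately not matched, since it does not name an individual symbol.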
Hello, the module has been loaded successfully.
Unfortunately, although I’ve added right PCI IDs, it doesn’t find any interface (driver hme claimed 0 device)
This is the output of /var/log/vmkernel
————————————–
May 2 11:18:15 node02 vmkernel: 0:00:21:38.301 cpu2:4106)sunhme loaded successfully.
May 2 11:18:15 node02 vmkernel: 0:00:21:38.301 cpu2:4106)ALERT: Elf: 2518: Kernel module sunhme was loaded, but has no signature attached
May 2 11:20:59 node02 vmkernel: 0:00:24:22.747 cpu2:4106)Loading module sunhme …
May 2 11:20:59 node02 vmkernel: 0:00:24:22.748 cpu2:4106)Elf: 2320: <sunhme> symbols tagged as <GPL>
May 2 11:20:59 node02 vmkernel: 0:00:24:22.758 cpu1:4106)module heap : Initial heap size : 102400, max heap size: 4194304
May 2 11:20:59 node02 vmkernel: 0:00:24:22.758 cpu1:4106)module heap sunhme: creation succeeded. id = 0x4100bbc00000
May 2 11:20:59 node02 vmkernel: 0:00:24:22.758 cpu1:4106)module skb heap : Initial heap size : 524288, max heap size: 23068672
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)module skb heap : creation succeeded
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)PCI: driver hme is looking for devices
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)PCI: Trying 0000:00:02.0
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)PCI: Trying 0000:00:02.1
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)PCI: Trying 0000:00:04.0
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)PCI: Trying 0000:00:04.1
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)PCI: Trying 0000:00:06.0
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)PCI: Trying 0000:00:0a.0
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)PCI: driver hme claimed 0 device
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)Mod: 2986: Initialization for sunhme succeeded with module ID 57.
May 2 11:20:59 node02 vmkernel: 0:00:24:22.759 cpu1:4106)sunhme loaded successfully.
————————————–
After some retries, I decided to create a new file called /etc/vmware/pciid/hme.xml that contains the new PCI addresses :
<?xml version='1.0' encoding='iso-8859-1'?>
<pcitable>
<vendor id="108e">
<short>Sun</short>
<name>Sun Microsystems Computer Corp.</name>
<device id="1001">
<vmware label="nic">
<driver>sunhme</driver>
</vmware>
<name>Sun Happy Meal</name>
<table file="pcitable" module="ignore" />
<table file="pcitable.Linux" module="sunhme">
<desc>Sun Microsystems Computer Corp.|Sun Happy Meal</desc>
</table>
</device>
<device id="1000">
<vmware label="nic">
<driver>sunhme</driver>
</vmware>
<name>Sun Happy Meal</name>
<table file="pcitable" module="ignore" />
<table file="pcitable.Linux" module="sunhme">
<desc>Sun Microsystems Computer Corp.|Sun Happy Meal</desc>
</table>
</device>
</vendor>
</pcitable>
Then I ran the esxcfg-pciid executable, which recreated a correct simple.map (I’m on ESX 4 and I didn’t create an oem.tgz).
Now it works correctly, it sees all the NICs and they have been added to vSwitch to simulate NIC TEAMING. So, the driver works !!!
The only problem I have is that on reboot I must manually run the command “vmkload_mod sunhme” to load the module. Is there a way to do this automatically?
Best regards.
I solved this issue by editing the /etc/rc.local file. Unfortunately esxcfg-module didn’t work as it should (enabling the module on boot).
[root@node02 ~]# cat /etc/rc.local
#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don’t
# want to do the full Sys V style init stuff.
vmkload_mod sunhme
touch /var/lock/subsys/local
[root@node02 ~]#
Best regards.
torrmkr, that sounds like you have it working. I must admit I don’t quite understand the problem you’re seeing re not detecting the sunhme devices, and the use of that xml file and esxcfg-pciid to resolve it, but if it works for you that’s great (I’ve only ever used the oem.tgz and simple.map on ESXi 4 per my post). I was thinking a bit more about how to avoid the random MAC addresses, and a simple cheap workaround might be to change the sunhme.c code from;
to something like;
Still not the greatest solution, but at least you know what the MAC addresses will be.
Hello Kernel,
Just an update. I had to give up on ESXi 4 and trying to compile the tulip.o modules.
Ultimately switched to using VMware Server 2.0.2 on CentOS 5.4 x64. Everything configured fine since CentOS includes the tulip.o modules.
Ran VMware Server 2.0.2 for the past month and I’d have to give it a 6/10. The web interface continually dies every time I try to update, create new VM’s and I must constantly restart the web host. VMware opted for a web-based GUI to manage VM’s and I’d have to agree with a lot of the blogs out there, quite frankly, it sucks and renders the product semi-usable. Also, sometimes my VM’s will just blink off after a few days with no rhyme or reason. This is a development box, but some stability would be nice.
I’m now switching to using Win2K3 Server x64 Standard with VMware Server 2.0.2. I’ll report back after this experience.
Again, thanx for your input.
Best…
grittykitty@kissthesquirrel.com
Thanks for the update grittyKitty. That’s too bad re the tulip driver. Some of these drivers seem to be harder than others to get ‘working’.
Worked for me. Using Marvell Yukon 88E8071 (identified by lspci as Galileo Tech Ltd 88E8071 network adapter, venID:prodID 11ab:436b) – using sky2 driver.
Machine, for reference, is Acer Aspire M3300 Athlon 620 x4
Awesome! Million thanks.
^^^ never mind… The 88E8071 works at the beginning, ping is good and I can load the HTTP homepage but then it just hangs all nw traffic, no ping reply no restart no nothing. Oh well, it was worth the try.
That’s no good. I was thinking that it was a newer Marvell chipset, but it looks like an older one (it’s referred to in the older sky2.c sources). I guess check the various ESXi log files to see if there are any debug messages about what’s happening. Also try ‘esxcfg-nics -l’ to check the interface status to see if that has any useful info.
Thanks for reply.
Logs show no sign of failure on network side. The esxcfg command shows link is up & normal, sky2 driver loaded etc. A patch might be needed as described here http://www.mail-archive.com/netdev@vger.kernel.org/msg39532.html but not sure if that would work.
Another thing that might be wrong: I am trying your binaries on ESXi 4.0 Update 1, so maybe there is some kernel incompatibility? I just had the server machine freeze on me when trying to download the vSphere client over the web interface. I’m pondering whether it is cheaper to just buy a NIC.
shadowncs, I had a look through the patch on that http://www.mail-archive.com/netdev@vger.kernel.org/msg39532.html link, and compared a lot of it against the driver code in the sky2.c and sky2.h source I was using. The source I was using seems like it already had something similar applied anyway. But, I did have a go at manually applying that patch. I’ve compiled it and uploaded it as sky2-for-esxi4-88e8071-test1.tar.gz. Feel free to try it and see if it works any better. I have low expectations though.
Hello Kernel,
I have a problem with the driver. I’ve written it all up here: http://www.vm-help.com/forum/viewtopic.php?f=12&t=2421&start=0. Maybe you can help me?
Best regards.
I’ll have a look if I get some time (am a bit busy lately). I guess I would first try a regular linux distro on this system, using the same kernel version that you’re using as a base for your esxi driver, and confirm that the rhine driver can detect and work with that card.