Virtual Geek

Tales from real IT system administrators world and non-production environment

Microsoft Powershell: Download a whole folder of files/subfolders from the web directory

A friend asked me to help write a script to bulk-download files and folders from a newly created internal office training web portal. The folder and file structure on the web is shown below. The first URL lists the different training material that was available, and downloading it one file at a time would have taken a long time. Clicking any of those links opens a directory that holds the PPTs, video files, and further subfolders.

[Screenshot: training web portal listing of files and folders]

To run this script I need the URL from the second browser window. When the ps1 script is executed, it asks for two main parameters: the first is Downloadurl and the second is DownloadToFolder. Once the URL is validated, it starts downloading the files and shows a nice tree view. If any file is not downloadable or returns a 404 error, it gives me a message in red.

[Screenshot: Start-DirDownload running in PowerShell and downloading the training material]
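The URL check mentioned above can be approximated without touching the network. This is a minimal sketch and not part of the original script: `Test-HttpUrl` is a hypothetical helper that only confirms the string parses as an absolute http or https URL. The script itself relies on `Invoke-WebRequest` with `-ErrorAction Stop` to reject a URL that cannot be fetched.

```powershell
# Hypothetical helper (not in the original script): syntactic check only,
# it does not confirm that the server is reachable.
function Test-HttpUrl {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Url
    )
    $parsed = $null
    # TryCreate returns $false instead of throwing on malformed input
    if (-not [System.Uri]::TryCreate($Url, [System.UriKind]::Absolute, [ref]$parsed)) {
        return $false
    }
    # Accept only web URLs; file paths and other schemes are rejected
    return ($parsed.Scheme -in @('http', 'https'))
}

$goodUrl = Test-HttpUrl -Url 'http://freetrainings001.com/trainingPortal/AzureAdvanced'
$badUrl  = Test-HttpUrl -Url 'not a url'
```

A check like this could run before the first `Invoke-WebRequest` call so that an obviously malformed URL fails early with a clearer message.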

The main cmdlet in the script is Invoke-WebRequest, which fetches information from the web site. Once script execution is complete and all files are downloaded, you can view the download folder. I drilled down further into the folders and confirmed the files are there.

[Screenshot: downloaded files and folders in the download folder]
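To make the role of Invoke-WebRequest more concrete, here is a small sketch of the link filtering the script applies to the `Links` collection of a response. The sample objects are hypothetical stand-ins for real links (the `Module1` name is invented for illustration), and `Select-ContentLink` is a helper name I made up. The `innerHTML` and `innerText` properties are the ones exposed by the parsed HTML response in Windows PowerShell; PowerShell 6 and later return a simpler link object without them, so treat this as a sketch for Windows PowerShell.

```powershell
# Hypothetical stand-ins for the objects in (Invoke-WebRequest $url).Links
$sampleLinks = @(
    [pscustomobject]@{ innerHTML = '[To Parent Directory]'; innerText = '[To Parent Directory]'; href = '/trainingPortal/' }
    [pscustomobject]@{ innerHTML = 'web.config'; innerText = 'web.config'; href = '/trainingPortal/AzureAdvanced/web.config' }
    [pscustomobject]@{ innerHTML = 'Module1'; innerText = 'Module1'; href = '/trainingPortal/AzureAdvanced/Module1/' }
)

# Hypothetical helper: drops the navigation link and web.config,
# mirroring the Where-Object filter used in the script.
function Select-ContentLink {
    param(
        [Parameter(Mandatory = $true)]
        [object[]]$Links
    )
    $excluded = '[To Parent Directory]', 'web.config'
    $Links | Where-Object { $excluded -notcontains $_.innerHTML }
}

$contentLinks = @(Select-ContentLink -Links $sampleLinks)
$contentLinks | ForEach-Object { '{0} -> {1}' -f $_.innerText, $_.href }
```

Keeping the excluded names in one array makes it easier to add further entries later than chaining more `-ne` comparisons.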

Download this script here; it is also available on github.com.

#requires -version 3
<#
.SYNOPSIS
    Download all files and folders from an IIS directory website.
.DESCRIPTION
    The Start-DirDownload cmdlet downloads a complete directory and its files from the web.
.PARAMETER Downloadurl
    Prompts you for download url
.PARAMETER DownloadToFolder
    Prompts where you want to download files and folder from IIS web, DownloadPath is alias
.INPUTS
    No Input
.OUTPUTS
    Output is on console directly.
.NOTES
  Version:        2.0
  Author:         Kunal Udapi
  Creation Date:  12 February 2017
  Purpose/Change: Automated way to download files from the web (http://kunaludapi.blogspot.in)
  Useful URLs:    http://vcloud-lab.com
.EXAMPLE
    PS C:\>Start-DirDownload -Downloadurl http://freetrainings001.com/trainingPortal/AzureAdvanced -DownloadToFolder C:\Temp

    This command starts downloading files from the given URL and saves them to the given folder path.
#>

[CmdletBinding(SupportsShouldProcess=$True,
    ConfirmImpact='Medium',
    HelpURI='http://vcloud-lab.com')]
Param
(
    [parameter(Position=0, Mandatory=$true, ValueFromPipelineByPropertyName=$true)]
    [String]$Downloadurl = 'http://freetrainings001.com/Office%20Learning%20Portal/Powershell%20Advanced%20Training/',
    [parameter(Position=1, Mandatory=$true,ValueFromPipelineByPropertyName=$true)]
    [alias('DownloadPath')]
    [String]$DownloadToFolder = 'C:\Temp\test'
)
process {
    try {
        if (!(Test-Path -Path $DownloadToFolder)) {
            New-Item -Path $DownloadToFolder -Type Directory -Force -ErrorAction Stop | Out-Null
        }
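        # Fetch the directory listing; its Links property holds every anchor on the page
        $CMDBrowser = Invoke-WebRequest $downloadurl -ErrorAction Stop
        # Skip the parent-directory link and web.config (innerHTML/innerText come from the parsed HTML response in Windows PowerShell)
        $AllLinks = $CMDBrowser.links | Where-Object {$_.innerHTML -ne "[To Parent Directory]" -and $_.innerHTML -ne 'web.config'} #| Select -Skip 23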
        foreach ($link in $AllLinks) {
            $FolderName = $link.innerText
            $DownloadPath = Join-Path $DownloadToFolder $FolderName
            if (!(Test-Path $DownloadPath)) {
                New-Item -Path $DownloadPath -ItemType Directory | Out-Null
            }
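            # Rebuild the site root (scheme and host) from the download URL, e.g. 'http://host'
            $RawWebsite = $Downloadurl -split '/'
            $WebSite = $RawWebsite[0,2] -join '//'
            Write-Host $FolderName -BackgroundColor DarkGreen
            # Link hrefs are assumed to be root-relative, so the site root is prepended
            $FolderUrl = "{0}{1}" -f $WebSite, $link.href #$Downloadurl Replacedby $WebSite
            $FolderLinks = Invoke-WebRequest $FolderUrl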
            if ($FolderLinks.StatusCode -eq 200) {
                $FilesLinks = $FolderLinks.Links | Where-Object {$_.innerHTML -ne '[To Parent Directory]' -and $_.innerHTML -ne 'web.config'} #| Select-Object -Skip 5
                foreach ($File in $FilesLinks) {
                    $FileUrl = "{0}{1}" -f $WebSite, $File.href
                    $FilePath = "{0}\{1}" -f $DownloadPath, $File.innerText
                    try {
                        if (!(Test-Path -Path $FilePath)) {
                            Invoke-WebRequest -Uri $FileUrl -OutFile $FilePath
                            Write-Host "`t|---$($File.innerText)"
                        }
                        else {
                            Write-Host "`t|---$($File.innerText) --Already exist" -ForegroundColor DarkYellow
                        }
                    }
                    catch {
                        Write-Host "`t|---$FilePath --skipped" -BackgroundColor DarkRed
                    }
                }
            }
        }
    }
    catch {
        Write-Host $error[0]
    }
}
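
The way the script builds each folder and file URL is easy to miss, so here is a short sketch of it in isolation. The first function repeats the split-and-join trick from the script. The second is an alternative I am suggesting, not something the original script does: it uses System.Uri, which also copes with hrefs that are relative to the current page. Both function names and the `Module1/slides.pptx` path are hypothetical and used only for illustration.

```powershell
# Split-based approach, as in the script:
# 'http://host/a/b' splits into 'http:', '', 'host', 'a', 'b'
function Get-SiteRootBySplit {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Url
    )
    $parts = $Url -split '/'
    # Elements 0 and 2 are the scheme and the host; '//' restores 'http://host'
    return ($parts[0, 2] -join '//')
}

# Alternative (an assumption, not the author's method): let System.Uri
# resolve the href against the page URL.
function Resolve-LinkUrl {
    param(
        [Parameter(Mandatory = $true)]
        [string]$PageUrl,
        [Parameter(Mandatory = $true)]
        [string]$Href
    )
    $base = [System.Uri]$PageUrl
    $resolved = New-Object -TypeName System.Uri -ArgumentList $base, $Href
    return $resolved.AbsoluteUri
}

$page = 'http://freetrainings001.com/trainingPortal/AzureAdvanced/'
$root = Get-SiteRootBySplit -Url $page

# Root-relative href: prepend the site root, as the script does
$viaFormat = '{0}{1}' -f $root, '/trainingPortal/AzureAdvanced/Module1/'

# Page-relative href: resolved against the page URL
$viaUri = Resolve-LinkUrl -PageUrl $page -Href 'Module1/slides.pptx'
```

The split-based version only works when every href starts at the site root, which appears to be what the script expects from the directory listing. The System.Uri version would handle both styles at the cost of a little more code.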
