
You have been tasked with building a URL file validator for a web crawler. A web crawler is an application that fetches a web page, extracts the URLs present in that page, and then recursively fetches new pages using the extracted URLs. The end goal of a web crawler is to collect text data, images, or other resources, or to validate the resource URLs and hyperlinks on a page. A URL validator checks whether an extracted URL points to a valid resource to fetch. In this scenario, you will build a URL validator that checks for supported protocols and file types. What do you need to do?
1. Writing detailed comments and docstrings
2. Organizing and structuring code for readability
3. URL = <protocol>://<hostname>/<fileinfo>
Steps for Completion
Task
Create two lists of strings: one for protocols, called valid_protocols, and one for file extensions, called valid_fileinfo. For this task, the protocol list should be restricted to http, https, and ftp. The file extension list should contain .html, .docx, and .csv.
Split an input named url, and then use the first element to check whether the protocol of the URL is in valid_protocols. Similarly, check whether the URL ends with a file extension listed in valid_fileinfo.
Task
Write the conditions to return a Boolean value of True if the URL is valid, and False if either the Protocol or the File extension is not valid.
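The splitting step described in the tasks above can be sketched as follows. This is only an illustration under stated assumptions: the helper name split_url and the sample URL are hypothetical and not part of the exercise, and the sketch assumes the URL follows the protocol://hostname/fileinfo form.

```python
def split_url(url):
    """Split a URL into (protocol, hostname, fileinfo).

    Hypothetical helper, not part of the exercise template.
    Returns None when the "://" separator is missing.
    """
    # Split once on "://" so the first element is the protocol.
    parts = url.split("://", 1)
    if len(parts) != 2:
        return None
    protocol, rest = parts
    # Everything up to the first "/" is the hostname, the remainder is the file info.
    hostname, _, fileinfo = rest.partition("/")
    return protocol, hostname, fileinfo


# Made-up sample URL, used only to show the shape of the result.
print(split_url("https://www.example.com/report.csv"))
```

The first element of the result can then be compared against valid_protocols, and the last element can be checked for an extension in valid_fileinfo.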
main.py
def validate_url(url):
    """
    Validates the given url passed as string.

    Arguments:
    url --- String, A valid url should be of form <Protocol>://<Hostname>/<Fileinfo>

    Protocol = [http, https, ftp]
    Hostname = string
    Fileinfo = [.html, .csv, .docx]
    """
    # your code starts here.

    return  # return True if url is valid else False


if __name__ == '__main__':
    url = input("Enter an Url: ")
    print(validate_url(url))
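One possible way to fill in the template is sketched below. It is not an official answer, and it makes a few assumptions that the exercise does not state: protocol and extension comparisons are case-insensitive, a URL with an empty hostname or no file part is treated as invalid, and query strings or fragments are not handled.

```python
import os

# Lists required by the task.
valid_protocols = ["http", "https", "ftp"]
valid_fileinfo = [".html", ".csv", ".docx"]


def validate_url(url):
    """Return True if url has a supported protocol and file extension.

    Expected form: <Protocol>://<Hostname>/<Fileinfo>
    """
    # Split once on "://"; the first element is the protocol.
    parts = url.split("://", 1)
    if len(parts) != 2:
        return False
    protocol, rest = parts

    # Separate the hostname from the file part.
    hostname, _, path = rest.partition("/")
    if not hostname or not path:
        return False  # assumption: both parts must be present

    # os.path.splitext returns (root, extension), e.g. ("index", ".html").
    extension = os.path.splitext(path)[1]

    # Assumption: compare case-insensitively.
    return protocol.lower() in valid_protocols and extension.lower() in valid_fileinfo


if __name__ == "__main__":
    # Made-up sample URLs for a quick manual check.
    for sample in ("https://www.example.com/index.html",
                   "smtp://example.com/index.html",
                   "http://example.com/image.png"):
        print(sample, validate_url(sample))
```

The function returns False as soon as the basic shape of the URL is wrong, so the final line only has to combine the two membership checks that the task asks for.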
