C# IDocument.QuerySelector Method Code Examples

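This article collects typical usage examples of the IDocument.QuerySelector method in C#. If you are wondering how IDocument.QuerySelector is used in practice, or are looking for working examples of it, the selected code below may help. You can also look further into usage examples of IDocument, the type that declares this method.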


Two code examples of the IDocument.QuerySelector method are shown below, sorted by popularity by default.

Example 1: ScrapDocument

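        // Needs: System, System.Globalization, System.Text.RegularExpressions,
        // and the namespace that defines IDocument (assumed to be AngleSharp.Dom).
        /// <summary>
        /// Scrapes the given HTML document
        /// </summary>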
        /// <param name="document">HTML document to be scraped</param>
        /// <returns>The scraped Apartment</returns>
        public override ApartmentDTO ScrapDocument(IDocument document)
        {
            var apartment = new ApartmentDTO();

            // Publication Data
            apartment.PublicationUrl = document.Url;
            apartment.PublisherCode = PublisherCode.Akelius;
            var urlMatch = Regex.Match(document.Url, @"/berlin/(\d+)");
            apartment.PublisherInternalId = urlMatch.Groups[1].Value;

            // Apartment's title data (rooms & neighborhood)
            var titleElementText = document.QuerySelector("header h2").TextContent;
            var titleMatch = Regex.Match(titleElementText, "(.*) Zimmer, (.*)");
            apartment.Title = titleElementText;
            apartment.Rooms = double.Parse(titleMatch.Groups[1].Value);

            // Apartment's size
            var sizeMatch = Regex.Match(document.QuerySelector(".fact-size").TextContent, @"(\d+)m");
            apartment.Size = short.Parse(sizeMatch.Groups[1].Value);

            // Apartment's availability
            var availabilityElement = document.QuerySelector(".fact-availability");
            if (availabilityElement != null)
            {
                var availabilityMatch = Regex.Match(availabilityElement.TextContent, @"Verfügbar ab (\d+\.\d+\.\d+)");
                if (availabilityMatch.Success)
                {
                    apartment.AvailableFrom = DateTime.ParseExact(availabilityMatch.Groups[1].Value, "dd.MM.yyyy", CultureInfo.InvariantCulture);
                }
            }

            // Apartment's rent
            apartment.TotalRent = double.Parse(document.QuerySelector(".fact-netrent span").TextContent);
            foreach (var feature in document.QuerySelectorAll(".features .bundle"))
            {
                var label = feature.QuerySelector(".key .factlabel");
                if (label == null)
                {
                    continue;
                }
                if (label.TextContent == "Nettokaltmiete")
                {
                    var netValueElement = feature.QuerySelector(".value").TextContent;
                    apartment.NetRent = double.Parse(netValueElement.Replace("€", string.Empty));
                }
                if (label.TextContent == "Nebenkosten")
                {
                    var netValueElement = feature.QuerySelector(".value").TextContent;
                    apartment.Charges = double.Parse(netValueElement.Replace("€", string.Empty));
                }
            }

            return apartment;
        }
Developer ID: fedebertolini, Project: ApartmentAggregator, Lines of code: 59, Source: AkeliusScraper.cs
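
Example 1 feeds regex capture groups straight into parsers such as double.Parse and DateTime.ParseExact. A group read from a failed match has an empty Value, which those parsers reject with a FormatException. The following is a minimal sketch of guarding that step, not code from the original project: the class name MatchGuardSketch, the helper ParseAvailableFrom, and the sample strings are hypothetical and were made up for illustration.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class MatchGuardSketch
{
    // Hypothetical helper: returns the parsed date, or null when the text
    // does not contain a usable "Verfügbar ab dd.MM.yyyy" date.
    public static DateTime? ParseAvailableFrom(string text)
    {
        Match match = Regex.Match(text, @"Verfügbar ab (\d+\.\d+\.\d+)");

        // Success reports whether the input matched. For a failed match,
        // Groups[1].Value is an empty string, so parsing must be skipped.
        if (!match.Success)
        {
            return null;
        }

        // TryParseExact returns false instead of throwing when the digits
        // do not form a valid dd.MM.yyyy date.
        DateTime date;
        bool parsed = DateTime.TryParseExact(
            match.Groups[1].Value,
            "dd.MM.yyyy",
            CultureInfo.InvariantCulture,
            DateTimeStyles.None,
            out date);

        return parsed ? date : (DateTime?)null;
    }

    public static void Main()
    {
        // Hypothetical inputs, made up for this sketch.
        Console.WriteLine(ParseAvailableFrom("Verfügbar ab 01.03.2017").HasValue);
        Console.WriteLine(ParseAvailableFrom("Sofort verfügbar").HasValue);
    }
}
```

The same pattern could be applied to the room count, size, and rent values in Example 1, using double.TryParse or short.TryParse in place of the throwing Parse calls.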

Example 2: getContent

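        // Needs: System, System.Linq, and the namespace that defines IDocument
        // (assumed to be AngleSharp.Dom). Safe() appears to be a helper from the
        // original project; its definition is not shown here.
        public static string getContent(IDocument document, PageModel pageModel)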
        {
            // find nodes for html, remove junk
            var jqContent = document.QuerySelector(pageModel.ArticleNodeSelector);
            jqContent.QuerySelectorAll(pageModel.ArticleRemoveSelector).ToList().ForEach(x => x.Remove());
            // remove scripts
            var allowedScripts = new[]
            {
                "platform.instagram.com",
                "platform.twitter.com"
            };
            foreach (var script in jqContent.QuerySelectorAll("script").ToList())
            {
                var src = script.GetAttribute("src").Safe();
                if (!src.Any() || !allowedScripts.Any(x => src.Contains(x)))
                {
                    Console.WriteLine("     - Script blocked: " + src);
                    script.Remove();
                }
            }

            // mod images
            foreach (var element in jqContent.QuerySelectorAll("img").ToList())
            {
                // remove dimensions 
                element.RemoveAttribute("width");
                element.RemoveAttribute("height");
                // prefix sources
                var src = element.GetAttribute("src").Safe();
                if (src.Any() && !src.StartsWith("http"))
                {
                    src = pageModel.ImagePrefix + src;
                    element.SetAttribute("src", src);
                }
            }

            return jqContent.InnerHtml.Safe();
        }
Developer ID: dedabyte, Project: ScrapNews, Lines of code: 38, Source: Program.cs
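
Both examples dereference the result of QuerySelector directly in several places, while Example 1 checks it for null in one place. The sketch below shows the null case in isolation. It assumes that IDocument is the AngleSharp type AngleSharp.Dom.IDocument, which the snippets above do not state; the class QuerySelectorSketch, the helper TextOrNull, and the sample markup are hypothetical.

```csharp
using System;
using System.Threading.Tasks;
using AngleSharp;
using AngleSharp.Dom;

public static class QuerySelectorSketch
{
    // Hypothetical helper: returns the trimmed text of the first element
    // matching the selector, or null when nothing matches.
    public static string TextOrNull(IDocument document, string selector)
    {
        // QuerySelector returns null when no element matches the selector.
        IElement element = document.QuerySelector(selector);
        return element?.TextContent.Trim();
    }

    public static async Task Main()
    {
        // Hypothetical markup, made up for this sketch.
        const string html = "<header><h2> 2 Zimmer, Mitte </h2></header>";

        // Parse the markup in memory; no network request is made.
        IBrowsingContext context = BrowsingContext.New(Configuration.Default);
        IDocument document = await context.OpenAsync(req => req.Content(html));

        Console.WriteLine(TextOrNull(document, "header h2") ?? "(no match)");
        Console.WriteLine(TextOrNull(document, ".fact-size") ?? "(no match)");
    }
}
```

Returning null and letting the caller decide is only one option; a scraper might instead log the missing selector and skip the record.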


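Note: The IDocument.QuerySelector examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects, and copyright in the source code remains with the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.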